Internet-Draft                                               Stone et al.
Intended status: Standards Track                             March 2026
Expires: September 2026


         SwarmScore V1: Volume-Scaled Agent Reputation Protocol
                    draft-stone-swarmscore-v1-01

Abstract

   SwarmScore V1 is a transparent, community-governed open standard for
   agent reputation scoring in open marketplaces. It provides a two-
   dimensional scoring system that measures agent performance across
   technical execution (via Conduit browser verification) and commercial
   reliability (via AP2 payment protocol). SwarmScore uses volume-scaled
   metrics to reward consistent, high-volume performance while filtering
   out single-transaction luck. The protocol produces cryptographically
   signed, self-verifiable certificates that marketplaces can use to
   filter, rank, and price agent services.

   This document defines the V1 comprehensive specification: formula,
   trust tiers, escrow integration, wire format, governance model, legal
   framework, implementation guidance, V2 roadmap, competitive analysis,
   and known limitations. It is designed for immediate implementation in
   production agent marketplaces and serves as the foundation for V2
   extensions (multi-pillar scoring, safety testing).

   AUTHORITY = FORMULA x GOVERNANCE. A deterministic formula without
   transparent governance is a private algorithm. SwarmScore V1 provides
   both.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts have a limited lifetime. Expiration will eventually
   force the use of more recent versions. Expires in 6 months.

   This specification is made available under a dual Apache 2.0 / MIT
   license. Implementers may choose either license.

Copyright Notice

   Copyright (c) 2026 SwarmSync Labs. All rights reserved.

---

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  SwarmScore V1 Formula
   4.  Trust Tiers
   5.  Escrow Integration
   6.  Wire Format Specification
   7.  Verification Protocol
   8.  Security Considerations
   9.  References
   10. Governance Model
   11. Legal and Liability Framework
   12. Implementation Guide
   13. V2 Roadmap and Extensions
   14. Competitive Analysis
   15. Known Limitations and Failure Modes
   Appendix A.  Test Vectors (10 Reference Agents)
   Appendix B.  Escrow Modifier Reference Curves
   Appendix C.  Reference Implementation Guide
   Appendix D.  Advisory Board Charter

---

1. Introduction

   AI agents in marketplace environments face a critical trust problem:
   how can buyers confidently transact with agents they have never
   interacted with? Traditional reputation systems (star ratings, review
   text) are slow to accumulate and vulnerable to manipulation.

   SwarmScore V1 solves this by providing a quantitative, real-time
   reputation score computed from two dimensions of agent behavior:

   - Technical Execution (Conduit): The agent's ability to reliably
     execute browser automation tasks. Measured via Conduit protocol
     sessions (https://swarmsync.ai/conduit).

   - Commercial Reliability (AP2): The agent's ability to honor
     agreements and deliver on payment-protocol obligations. Measured
     via AP2 escrow transactions (https://swarmsync.ai/ap2).

   Both dimensions are volume-scaled: an agent with a single Conduit
   session and a 100% success rate earns a Conduit contribution of 4,
   while an agent with 80 sessions and a 95% success rate earns 304
   (Section 3.4). This prevents luck from inflating reputation.

   The score is computed deterministically, signed cryptographically,
   and published as a self-verifiable certificate. Buyers and
   marketplaces can check the signature without contacting SwarmScore
   servers, enabling decentralized trust.

   SwarmScore V1 is designed to be simple, auditable, and immediately
   deployable. V2 adds a Safety dimension (canary testing) and a
   multi-pillar framework; V1 provides the foundation.

1.1. Motivation

   Existing agent reputation systems fall into two categories:

   - Implicit (platform-internal): GitHub stars, Hugging Face
     downloads, OpenAI API usage. Fast, but opaque and non-portable.

   - Explicit (review-based): User ratings on Upwork, Fiverr, Kaggle
     Competitions. Transparent, but slow to accumulate and vulnerable
     to gaming.

   SwarmScore V1 bridges these by providing explicit, cryptographically
   verifiable, real-time scores that:

   - Are computed from objective transaction data (not subjective
     reviews).
   - Can be computed independently (no platform required).
   - Update continuously (new transaction = new score).
   - Are portable (scores follow agents across marketplaces).

1.2. Design Principles

   DETERMINISTIC: Same input always produces same score; no randomness
   or human judgment.

   AUDITABLE: Formula is public; any marketplace can verify scores
   locally.

   VOLUME-COMPENSATED: Success rate alone does not inflate score; high
   volume + high rate = high score.

   CONTINUOUS: Score updates in real-time as new transactions complete.

   PORTABLE: Scores are not platform-specific; they follow the agent.

   CRYPTOGRAPHICALLY SIGNED: Third-party verification possible without
   trust in the issuer.

   GOVERNED: Clear process for updates, disputes, and evolution. The
   Advisory Board (Section 10) manages these processes publicly.

1.3. Authority Claim

   SwarmScore V1 claims to be the universal open standard for agent
   reputation scoring. This claim rests on four pillars, not one:

   (1) Deterministic formula (Section 3)
   (2) Portable certification (Section 6)
   (3) Economic incentive alignment (Section 5)
   (4) Transparent governance (Section 10)

   A scoring system with pillars (1)-(3) only is a private algorithm.
   Adding pillar (4) makes it a community standard. This document
   provides all four.

2. Terminology

   Agent: An AI service provider in the marketplace (may be a model,
   an agentic system, or a human-in-the-loop).

   Conduit Session: A browser automation task executed by an agent
   and verified via Conduit protocol.

   AP2 Transaction: A payment-protocol transaction between a buyer
   agent and a provider agent.

   Escrow: Funds held in trust for the duration of an AP2 transaction.

   Success Rate: Fraction of completed transactions that met buyer's
   stated criteria (deflection, partial, or full compliance as defined
   per transaction).

   Volume Factor: Scaling multiplier based on transaction count;
   increases from 0 to 1 as volume increases.

   SwarmScore: The composite reputation score (0-1000 scale).

   Trust Tier: A label (NONE, STANDARD, ELITE) derived from score and
   volume thresholds.

   Escrow Modifier: A multiplier (0.25-1.0) applied to the escrow
   amount to determine the hold; derived from SwarmScore.

   Execution Passport: The signed certificate containing SwarmScore
   and associated metadata.

   Advisory Board: The 5-7 member governance body that oversees this
   specification (Section 10).

   Normative/Informative: Sections marked [NORMATIVE] define protocol
   requirements. Sections marked [INFORMATIVE] provide guidance only.

3. SwarmScore V1 Formula

   [NORMATIVE]

3.1. Inputs

   For a given agent, gather metrics over the last 90 days:

   - conduit_sessions_90d: Total number of Conduit sessions completed.
   - conduit_successful_90d: Number of Conduit sessions marked VERIFIED
     (passed all verification checks).
   - ap2_sessions_90d: Total number of AP2 transactions completed (as
     provider).
   - ap2_successful_90d: Number of AP2 transactions settled successfully
     (not disputed, not refunded).

3.2. Computed Rates

   conduit_rate = conduit_successful_90d / conduit_sessions_90d
                  (if conduit_sessions_90d == 0, conduit_rate = 0)

   ap2_rate = ap2_successful_90d / ap2_sessions_90d
              (if ap2_sessions_90d == 0, ap2_rate = 0)

3.3. Volume Factors

   conduit_volume_factor = min(1.0, conduit_sessions_90d / 100)

   ap2_volume_factor = min(1.0, ap2_sessions_90d / 50)

   Rationale: Conduit sessions are lower-stakes (typically <$1 each);
   100 sessions represents meaningful volume. AP2 transactions are
   higher-stakes (typically >$1k); 50 transactions represents
   meaningful volume. At 100 and 50 respectively, volume factor = 1.0
   and no further scaling occurs.

3.4. Contributions

   conduit_contribution = floor(conduit_rate * conduit_volume_factor
                                * 400)

   ap2_contribution = floor(ap2_rate * ap2_volume_factor * 600)

   Rationale: Maximum contributions are 400 (Conduit) + 600 (AP2) =
   1000 total. AP2 is weighted heavier (600 vs 400) because escrow-
   backed transactions represent higher trust and higher stakes.

3.5. Composite Score

   raw_score = conduit_contribution + ap2_contribution

   swarmscore = max(0, min(1000, raw_score))

   The score is clamped to [0, 1000].
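
   The computation in Sections 3.2 through 3.5 can be sketched in
   Python as follows. This is an illustrative sketch, not a reference
   implementation: the names Metrics90d, contribution, and swarmscore
   are hypothetical. The sketch assumes integer arithmetic is
   acceptable, so that floor() is exact and does not depend on
   floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics90d:
    """Section 3.1 inputs for one agent over the trailing 90 days."""
    conduit_sessions: int
    conduit_successful: int
    ap2_sessions: int
    ap2_successful: int


def contribution(successful: int, sessions: int,
                 full_volume: int, weight: int) -> int:
    """floor(rate * volume_factor * weight), computed with integers only."""
    if sessions <= 0:
        return 0  # Section 3.2: the rate is 0 when there are no sessions
    capped = min(sessions, full_volume)  # volume_factor = capped / full_volume
    # rate * volume_factor * weight
    #   = (successful / sessions) * (capped / full_volume) * weight
    return (successful * capped * weight) // (sessions * full_volume)


def swarmscore(m: Metrics90d) -> int:
    """Composite score, clamped to [0, 1000] (Section 3.5)."""
    conduit = contribution(m.conduit_successful, m.conduit_sessions, 100, 400)
    ap2 = contribution(m.ap2_successful, m.ap2_sessions, 50, 600)
    return max(0, min(1000, conduit + ap2))


if __name__ == "__main__":
    print(swarmscore(Metrics90d(80, 76, 40, 38)))
```

   For 80 Conduit sessions (76 successful) and 40 AP2 transactions
   (38 successful), the contributions are 304 and 456, for a score of
   760.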

3.6. Escrow Modifier

   The escrow modifier is a continuous curve applied to escrow hold
   amounts during AP2 transactions. Lower scores = higher holds
   (buyer protection); higher scores = lower holds (agent efficiency).

   raw_modifier = 1.0 - (swarmscore / 1250)

   escrow_modifier = max(0.25, min(1.0, raw_modifier))

   This produces a continuous curve:

   - swarmscore = 0    -> escrow_modifier = 1.00 (full hold: 100%)
   - swarmscore = 700  -> escrow_modifier = 0.44
   - swarmscore = 1000 -> escrow_modifier = 0.25 (raw value 0.20,
                          clamped to the floor)

   Interpretation: Even high-reputation agents hold a minimum of 25%
   escrow (to prevent griefing); the floor is reached at a score of
   937.5 and applies to every score above it. At a score of 0 the
   full escrow amount is held, and the hold shrinks linearly as the
   score improves, which gives low-scoring agents a path to redemption.

   Note: The escrow modifier floor of 0.25 is a V1 constant. V2 may
   introduce differentiated floors based on Safety pillar score. See
   Appendix B for full curve tables and Section 13 for V2 changes.
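
   The modifier formula above can be evaluated directly. The
   following Python sketch is illustrative only; the function name
   escrow_modifier and the sample scores are chosen for this example
   and are not defined by the specification.

```python
def escrow_modifier(swarmscore: int) -> float:
    """Section 3.6: linear in the score, clamped to [0.25, 1.0]."""
    raw_modifier = 1.0 - swarmscore / 1250
    return max(0.25, min(1.0, raw_modifier))


if __name__ == "__main__":
    # Tabulate a few points on the curve.
    for score in (0, 250, 500, 700, 850, 1000):
        print(f"{score:>4} -> {escrow_modifier(score):.3f}")
```

   Because raw_modifier reaches 0.25 at a score of 937.5, every
   integer score from 938 through 1000 maps to the same floor value.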

4. Trust Tiers

   [NORMATIVE]

   SwarmScore defines three trust tiers based on score and volume:

4.1. NONE Tier

   Condition:
     score < 700 OR conduit_sessions_90d < 50 OR ap2_sessions_90d < 25

   Meaning: Unproven, unreliable, or new. Marketplace may display
   "Unverified" or "New Agent" badge. Subject to additional
   verification requirements before high-stakes transactions.

4.2. STANDARD Tier

   Condition:
     score >= 700 AND conduit_sessions_90d >= 50 AND
     ap2_sessions_90d >= 25

   Meaning: Proven performer. Marketplace may display "Verified" or
   "Standard" badge. Eligible for standard marketplace features
   (search/ranking, standard escrow holds, API tier 2).

4.3. ELITE Tier

   Condition:
     score >= 850 AND conduit_sessions_90d >= 100 AND
     ap2_sessions_90d >= 50

   Meaning: High-reputation agent. Marketplace may display "Elite"
   or "Top" badge. Eligible for premium features (featured listings,
   reduced escrow holds, API tier 3, priority support).

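   Note: Tier is re-evaluated continuously. An agent loses ELITE
   status immediately if its score drops below 850 or either 90-day
   volume falls below the ELITE threshold; STANDARD status is lost
   when the score drops below 700 or either volume falls below the
   STANDARD threshold.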
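
   The three tier conditions can be expressed as a single function
   that tests the strictest tier first. This is a minimal sketch; the
   function name trust_tier and its parameter names are hypothetical.

```python
def trust_tier(score: int, conduit_sessions_90d: int,
               ap2_sessions_90d: int) -> str:
    """Return "ELITE", "STANDARD", or "NONE" per Sections 4.1 to 4.3."""
    if score >= 850 and conduit_sessions_90d >= 100 and ap2_sessions_90d >= 50:
        return "ELITE"
    if score >= 700 and conduit_sessions_90d >= 50 and ap2_sessions_90d >= 25:
        return "STANDARD"
    # Anything that fails the STANDARD condition satisfies the NONE condition.
    return "NONE"


if __name__ == "__main__":
    print(trust_tier(760, 80, 40))
```

   Note that a high score alone is not sufficient: an agent with a
   score of 900 but only 99 Conduit sessions is STANDARD, not ELITE.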

5. Escrow Integration

   [NORMATIVE]

   When a buyer initiates an AP2 transaction, the marketplace:

   1. Looks up the provider agent's current SwarmScore.
   2. Computes escrow_modifier (from Section 3.6).
   3. Applies the modifier: hold_amount = escrow_amount * escrow_modifier.
   4. Holds the reduced amount in escrow.
   5. On successful delivery (as verified by buyer), releases the hold
      (minus platform fee).
   6. On dispute, follows standard AP2 adjudication (not covered here).

   Example: A $1,000 escrow with a 0.44 modifier results in a $440
   hold. The remaining $560 is available to the agent immediately
   (to use for their own expenses during the transaction).

   This design incentivizes reputation: high-score agents have lower
   friction and faster cash flow.
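
   Steps 2 and 3 above can be sketched as follows. The sketch works
   in integer cents and exact fractions to avoid floating-point error
   in monetary amounts. The function names and the choice to round
   the hold up to a whole cent are assumptions of this example; the
   specification does not define a rounding rule.

```python
import math
from fractions import Fraction


def escrow_modifier(swarmscore: int) -> Fraction:
    """Section 3.6 modifier as an exact fraction in [1/4, 1]."""
    raw = 1 - Fraction(swarmscore, 1250)
    return max(Fraction(1, 4), min(Fraction(1), raw))


def split_escrow(escrow_cents: int, swarmscore: int) -> tuple[int, int]:
    """Return (held_cents, available_cents) for one AP2 transaction.

    Assumption: the hold is rounded up to the next whole cent.
    """
    held = math.ceil(escrow_cents * escrow_modifier(swarmscore))
    return held, escrow_cents - held


if __name__ == "__main__":
    held, available = split_escrow(100_000, 700)  # $1,000.00, score 700
    print(held, available)
```

   With a $1,000 escrow and a score of 700 (modifier 0.44), this
   reproduces the example above: $440 is held and $560 is available.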

6. Wire Format Specification

   [NORMATIVE]

6.1. Execution Passport (Certificate)

   The Execution Passport is a JSON document containing the agent's
   score and metadata, signed with HMAC-SHA256.

   {
     "swarmscore_version": "1.0",
     "agent_passport_id": "uuid-v4",
     "issuer": {
       "platform": "swarmsync.ai",
       "computed_at": "2026-03-17T14:30:00Z",
       "signature": "sha256_hmac_signature_here"
     },
     "score": {
       "value": 760,
       "tier": "STANDARD",
       "conduit_contribution": 304,
       "ap2_contribution": 456
     },
     "dimensions": {
       "technical_execution": {
         "label": "Conduit Execution",
         "sessions_90d": 80,
         "successful_sessions_90d": 76,
         "success_rate": 0.95,
         "volume_factor": 0.80,
         "max_contribution": 400,
         "actual_contribution": 304
       },
       "commercial_reliability": {
         "label": "AP2 Reliability",
         "sessions_90d": 40,
         "successful_sessions_90d": 38,
         "success_rate": 0.95,
         "volume_factor": 0.80,
         "max_contribution": 600,
         "actual_contribution": 456
       }
     },
     "escrow_modifier": 0.392,
     "qualification_gaps": [],
     "formula_version": "1.0",
     "expires_at": "2026-03-24T14:30:00Z"
   }

6.2. Signature Computation

   signature = HMAC-SHA256(
     key = SWARMSCORE_SIGNING_KEY,
     message = JSON_CANONICAL_FORM(passport_object_minus_signature)
   )

   JSON canonical form: sorted keys, no whitespace, UTF-8 encoding.
   Signature is hex-encoded in the "signature" field.
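
   A minimal signing sketch in Python, using only the standard
   library, is shown below. Several details are assumptions of this
   example rather than requirements of the specification: the
   signature is taken to live at issuer.signature (as in the example
   in Section 6.1), json.dumps with sorted keys and compact
   separators is used as the canonical form, and the function names
   are hypothetical. The text above does not define how numbers are
   serialized (for example 0.80 versus 0.8), so issuer and verifier
   must agree on that separately.

```python
import copy
import hashlib
import hmac
import json


def canonical_form(passport: dict) -> bytes:
    """Sorted keys, no whitespace, UTF-8, with issuer.signature removed."""
    unsigned = copy.deepcopy(passport)
    unsigned.get("issuer", {}).pop("signature", None)
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def sign_passport(passport: dict, signing_key: bytes) -> dict:
    """Return a copy of the passport with a hex HMAC-SHA256 signature."""
    signed = copy.deepcopy(passport)
    digest = hmac.new(signing_key, canonical_form(passport),
                      hashlib.sha256).hexdigest()
    signed.setdefault("issuer", {})["signature"] = digest
    return signed


if __name__ == "__main__":
    demo_key = b"\x00" * 32  # placeholder only; never use a fixed key
    demo = {"swarmscore_version": "1.0", "issuer": {"platform": "example"}}
    print(sign_passport(demo, demo_key)["issuer"]["signature"])
```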

6.3. Verification

   Any party that holds the signing key can verify the certificate:

   1. Extract the "signature" field.
   2. Remove the "signature" field and reconstruct the JSON canonical
      form of the remaining object (sorted keys, no whitespace, UTF-8).
   3. Compute: expected_signature = HMAC-SHA256(SIGNING_KEY, message).
   4. Compare: if expected_signature == signature, certificate is valid.

   Note: The SIGNING_KEY is a marketplace secret. Third-party
   verification requires the marketplace to provide the key (or a
   public-key equivalent in V2+).
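
   The four verification steps can be sketched in Python as follows.
   As with any example here, this is illustrative: the function name
   verify_passport is hypothetical, the signature is assumed to live
   at issuer.signature, and the canonical form is assumed to be
   json.dumps with sorted keys and compact separators. The sketch
   uses hmac.compare_digest, a constant-time comparison, which is a
   common precaution against timing attacks rather than something the
   steps above require.

```python
import copy
import hashlib
import hmac
import json


def verify_passport(passport: dict, signing_key: bytes) -> bool:
    """Return True if issuer.signature matches the recomputed HMAC."""
    # Step 1: extract the claimed signature.
    claimed = passport.get("issuer", {}).get("signature")
    if not isinstance(claimed, str):
        return False
    # Step 2: rebuild the canonical form without the signature field.
    unsigned = copy.deepcopy(passport)
    del unsigned["issuer"]["signature"]
    message = json.dumps(unsigned, sort_keys=True, separators=(",", ":"),
                         ensure_ascii=False).encode("utf-8")
    # Step 3: recompute the expected signature.
    expected = hmac.new(signing_key, message, hashlib.sha256).hexdigest()
    # Step 4: compare in constant time (hex compared case-insensitively).
    return hmac.compare_digest(expected.encode("ascii"),
                               claimed.lower().encode("utf-8"))
```

   A certificate whose fields were altered after signing, or one
   checked against the wrong key, fails the comparison in step 4.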

7. Verification Protocol

   [NORMATIVE]

7.1. Three Levels of Trust

   L1 (Lightweight): Client checks signature against
   SWARMSCORE_SIGNING_KEY. Confirms certificate has not been tampered
   with.

   L2 (Strong): Client re-computes the score from transaction data
   (if available locally) and compares to certificate score. Confirms
   the certificate's score is correct.

   L3 (Full Audit): Third-party auditor (e.g., buyer's security team)
   contacts SwarmScore to request transaction logs, verifies the 90-day
   metrics, re-computes the score, and compares. Confirms no tampering
   at the source.

   Typical marketplaces perform L1 (lightweight). High-stakes
   transactions may require L2 or L3.
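
   An L2 check amounts to re-running the Section 3 formula on locally
   held counts and comparing the result with the certificate. The
   sketch below assumes the local counts are available as
   (sessions, successful) pairs; the function names are hypothetical.

```python
def contribution(successful: int, sessions: int,
                 full_volume: int, weight: int) -> int:
    """floor(rate * volume_factor * weight) with exact integer arithmetic."""
    if sessions <= 0:
        return 0
    capped = min(sessions, full_volume)
    return (successful * capped * weight) // (sessions * full_volume)


def l2_check(passport: dict, conduit: tuple[int, int],
             ap2: tuple[int, int]) -> bool:
    """Compare the certificate score with a locally recomputed score.

    conduit and ap2 are (sessions_90d, successful_90d) pairs taken from
    the verifier's own transaction records.
    """
    recomputed = (contribution(conduit[1], conduit[0], 100, 400)
                  + contribution(ap2[1], ap2[0], 50, 600))
    recomputed = max(0, min(1000, recomputed))
    return passport["score"]["value"] == recomputed
```

   An L2 check is only as strong as the verifier's copy of the
   transaction data; it does not replace the L1 signature check.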

7.2. Certificate Endpoint (Optional)

   Marketplaces may expose: GET /swarmscore/{agent_id}/certificate

   Returns the current Execution Passport for the agent. Includes
   timestamp (computed_at) and expiration (expires_at). Certificates
   are valid for 7 days; expired certificates may be re-issued.
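
   A client that receives a certificate would typically check its
   validity window before using the score. The following sketch
   assumes the timestamps are ISO 8601 strings in UTC with a "Z"
   suffix, as in the Section 6.1 example; the function names are
   hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_VALIDITY = timedelta(days=7)


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2026-03-17T14:30:00Z as aware UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_current(passport: dict, now: Optional[datetime] = None) -> bool:
    """True if now falls inside a validity window of at most 7 days."""
    now = now or datetime.now(timezone.utc)
    computed_at = parse_timestamp(passport["issuer"]["computed_at"])
    expires_at = parse_timestamp(passport["expires_at"])
    if expires_at - computed_at > MAX_VALIDITY:
        return False  # window longer than the 7 days described above
    return computed_at <= now < expires_at
```

   An expired certificate is not necessarily wrong; the client can
   fetch a re-issued one from the certificate endpoint.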

7.3. Verification Endpoint (Optional)

   Marketplaces may expose: POST /swarmscore/verify

   Body:
   {
     "certificate": { ... full Execution Passport JSON ... },
     "agent_id": "agent-uuid"
   }

   Response:
   {
     "valid": true,
     "signature_valid": true,
     "score_valid": true,
     "expires_at": "2026-03-24T14:30:00Z",
     "detected_tampering": false
   }

8. Security Considerations

   [NORMATIVE]

8.1. Signature Key Management

   The SWARMSCORE_SIGNING_KEY is a shared secret (32+ bytes). It MUST:

   - Be stored securely (environment variable, secure key store).
   - Be rotated annually (issue new key, recompute all certificates).
   - Not be embedded in client code (only in backend).
   - Be different between production and staging environments.

   Anyone who obtains the signing key can forge certificates, so
   compromise of the key invalidates every signature issued under it.

8.2. Score Computation Data Sources

   SwarmScore relies on transaction data (Conduit sessions, AP2
   transactions) provided by the marketplace. If this data is corrupted
   or falsified, scores become meaningless.

   Marketplaces MUST:

   - Audit Conduit session verification (ensure VERIFIED status is
     correct).
   - Audit AP2 transaction settlement (ensure successful_90d count is
     accurate).
   - Implement write-once transaction logs (prevent retroactive
     modification).
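
   One way to make a transaction log tamper-evident is a hash chain,
   in which each entry commits to the hash of the previous entry.
   The sketch below is a simplified in-memory illustration under
   stated assumptions: the entry layout, the genesis value, and the
   function names are hypothetical, and a production system would
   also need durable storage and external anchoring of the latest
   hash.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for an empty log


def _entry_hash(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _entry_hash(prev_hash, record),
    }
    log.append(entry)
    return entry


def chain_is_intact(log: list) -> bool:
    """Recompute every hash; any retroactive edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

   Changing the status of an old session alters its hash and every
   hash after it, so the modification is detectable by anyone who
   holds a copy of a later entry_hash.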

8.3. Gaming and Manipulation

   Several attacks on SwarmScore are theoretically possible:

   - Volume Farming: An agent completes many low-value transactions to
     inflate volume. Mitigation: Marketplace monitors for transaction
     clustering; flags suspiciously similar transactions.

   - Success Rate Gaming: An agent only accepts easy transactions to
     maintain high success rate. Mitigation: Marketplace tracks
     rejection rates; flags agents with unusually high rejection rates.

   - Timestamp Manipulation: Agent back-dates transactions to fit the
     90-day window. Mitigation: Marketplace uses blockchain-style
     timestamping (hash chains, external notarization).

   These are operator-level attacks (require marketplace complicity or
   compromise). The protocol itself is sound. See Section 15 for a
   full treatment of known limitations and attack vectors.

8.4. Privacy Considerations

   SwarmScore certificates contain metrics (session counts, success
   rates) but not transaction details (who, what, how much). Metrics
   are aggregated over 90 days, reducing linkability to individual
   transactions.

   However, agents may prefer to keep their scores private. The
   certificate format allows for:

   - Signed certificates (portable, provable).
   - Unlisted certificates (stored server-side, not published).

   Marketplaces should provide agents a choice.

9. References

9.1. Normative References

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              DOI 10.17487/RFC2104, February 1997.

   [RFC2026]  Bradner, S., "The Internet Standards Process --
              Revision 3", BCP 9, RFC 2026, DOI 10.17487/RFC2026,
              October 1996.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in
              RFC 2119 Key Words", BCP 14, RFC 8174,
              DOI 10.17487/RFC8174, May 2017.

   [CONDUIT]  SwarmSync Labs, "Conduit: Cryptographically-Audited
              Browser Automation Protocol", 2026.
              https://swarmsync.ai/conduit

   [AP2]      SwarmSync Labs, "Agent Payment Protocol (AP2)", 2026.
              https://swarmsync.ai/ap2

9.2. Informative References

   [RFC3394]  Schaad, J. and R. Housley, "Advanced Encryption Standard
              (AES) Key Wrap Algorithm", RFC 3394,
              DOI 10.17487/RFC3394, September 2002.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648,
              October 2006.

   [RFC6749]  Hardt, D., "The OAuth 2.0 Authorization Framework",
              RFC 6749, DOI 10.17487/RFC6749, October 2012.

   [RFC7231]  Fielding, R. and J. Reschke, "Hypertext Transfer Protocol
              (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [ATEP]     SwarmSync Labs, "Agent Trust and Execution Passport
              (ATEP)", 2026. https://swarmsync.ai/atep

   [CANARY]   Stone et al., "SwarmScore V2 Canary: Safety-Aware Agent
              Reputation", draft-stone-swarmscore-v2-canary-01, 2026.

---

10. Governance Model

    [INFORMATIVE]

    This section defines the governance process for SwarmScore V1. A
    scoring protocol without governance is a private algorithm with a
    public formula. Governance provides the transparency, accountability,
    and update processes that make SwarmScore a community standard.

10.1. Why Governance Matters

    Consider two protocols with identical formulas:

    Protocol A: Formula is published. Updates made by single company.
    Protocol B: Formula is published. Updates made via public process.

    Protocol A gives marketplaces no input into formula changes that
    affect their operations. A company could silently change the formula
    to favor certain agents. Marketplaces would have no recourse.

    Protocol B gives marketplaces a voice. Formula changes are public,
    debated, and versioned. Marketplaces can forecast the impact of
    changes. Disputes have a formal process.

    SwarmScore V1 follows Protocol B.

10.2. Advisory Board Structure

    The SwarmScore Advisory Board consists of 5-7 members:

    - 1 seat: Marketplace CTO (rotating, nominated by marketplace
      operators annually)
    - 1 seat: IETF representative (nominated by IETF chair)
    - 1 seat: W3C AI Agent Protocol Community Group delegate
    - 1 seat: Independent security researcher (nominated by board,
      confirmed by public vote)
    - 1 seat: Agent developer representative (elected by registered
      agent developers)
    - 1-2 seats (optional): Domain experts appointed by existing board
      members (legal, economics, or technical specialists)

    Terms: 1 year, renewable twice (maximum 3 consecutive years).

    Conflicts of interest: Board members must disclose and recuse from
    decisions directly affecting their employer's marketplace position.

    Compensation: Board members serve as volunteers. Travel expenses
    for in-person meetings (at most 2 per year) may be reimbursed from
    the SwarmScore Foundation operating budget.

10.3. Standards Update Process

    Formula changes follow an RFC 2026-style process:

    Stage 1 - PROPOSAL: Any community member may submit a Proposal for
    Change (PFC) to the public mailing list at standards@swarmsync.ai.
    A PFC must include: (a) the proposed change, (b) rationale, (c)
    impact analysis, (d) migration path.

    Stage 2 - REVIEW: The Advisory Board assigns a reviewer within 14
    days. A public comment period of 30 days opens. All comments are
    archived at https://swarmsync.ai/governance/proposals/.

    Stage 3 - VOTE: After the comment period, the Advisory Board votes.
    Simple majority (>=4/7 or >=3/5) required for minor changes.
    Supermajority (>=6/7 or >=4/5) required for breaking changes
    (changes that alter score values for existing agents).

    Stage 4 - PUBLICATION: Approved changes are published as a new
    version of this document. Version numbering: V1.0 -> V1.1 (minor
    change) -> V2.0 (major/breaking change). V1.x changes MUST be
    backwards-compatible (they may add new features; they MUST NOT
    change the formula for V1 data).

    Stage 5 - IMPLEMENTATION: Marketplaces are given 90 days to
    implement approved changes. During this period, both old and new
    formula versions MUST be accepted. After 90 days, marketplaces
    SHOULD upgrade.

10.4. No Hidden Methodology Changes

    The Advisory Board makes the following binding commitments:

    1. No formula changes without public process (Section 10.3).
    2. No undisclosed Advisory Board conflicts of interest.
    3. All board meeting notes published within 14 days of meetings.
    4. All community comments acknowledged within 7 days.
    5. Vote results published with individual board member positions
       (no anonymous votes on formula changes).

    These commitments are enforced by the community: any marketplace or
    agent developer may call for a public audit of compliance with these
    commitments via the public mailing list.

10.5. Licensing

    This specification is dual-licensed: Apache 2.0 and MIT. Implementers
    may choose either license. The SwarmScore formula and wire format are
    free to implement commercially without royalties. The "SwarmScore"
    trademark may be used by compliant implementations.

    Compliant implementations: An implementation is "SwarmScore
    Compliant" if it implements Sections 3-8 of this document without
    modification to the formula parameters. Advisory Board certifies
    compliance on request.

10.6. Rapid Evolution Process

    If a regulatory change (e.g., SEC, EU AI Act) requires formula
    modification, the Advisory Board may invoke "Emergency Update
    Protocol":

    - 7-day public comment period (instead of 30 days)
    - Board vote by electronic ballot within 3 days of comment close
    - Published as V1.x.hotfix within 48 hours of vote

    Emergency protocol requires supermajority vote to invoke
    (6/7 or 4/5 board members).

---

11. Legal and Liability Framework

    [INFORMATIVE]

    This section addresses legal considerations for marketplace
    operators implementing SwarmScore. It is informative, not normative:
    operators should consult qualified legal counsel for their specific
    jurisdiction.

11.1. Agent Consent and Disclosure Requirements

    Operators implementing SwarmScore SHOULD:

    - Disclose to agents that their transaction data (Conduit sessions,
      AP2 transactions) is used to compute a public reputation score.

    - Obtain explicit consent from agents before their first scored
      transaction, including: (a) what data is used, (b) how the score
      is computed, (c) where the score is published, (d) how to appeal.

    - Provide agents access to their own score data and the transaction
      records underlying it.

    - Honor requests to suppress public display of score (though the
      score may still be computed and used internally for escrow
      purposes).

    Sample disclosure language:

      "Your transactions on this marketplace are used to compute a
      SwarmScore, an open-standard reputation score. The formula is
      public at [URL]. Your score affects escrow hold amounts and
      marketplace visibility. You may review your score and the data
      behind it at any time. Contact us to dispute errors."

11.2. Liability Limitations for Marketplaces

    SwarmScore is a reputational signal, not a guarantee of
    performance. Marketplaces implementing SwarmScore:

    - Are NOT liable for agent failures where the agent holds a high
      SwarmScore (the score reflects historical performance, not future
      guarantees).

    - SHOULD include a liability disclaimer on score display pages and
      in their terms of service.

    - SHOULD carry Errors & Omissions (E&O) insurance covering claims
      arising from agent marketplace decisions (including score-based
      decisions). Recommended minimum: $1M per occurrence.

    Sample liability disclaimer:

      "SwarmScore is a historical performance metric computed from
      objective transaction data. It does not guarantee future
      performance. This marketplace is not liable for losses arising
      from transactions with any agent regardless of SwarmScore."

11.3. Appeals Process

    Any agent who believes their score is computed incorrectly may
    appeal to the issuing marketplace:

    Grounds for appeal: Data errors only. Valid grounds include:
    - Incorrect transaction count (e.g., Conduit sessions not counted)
    - Incorrect success/failure status (e.g., VERIFIED session marked
      FAILED)
    - Identity error (sessions attributed to wrong agent ID)

    Invalid grounds for appeal: Score formula disagreement. The formula
    is the formula; agents cannot appeal that their 85% success rate
    should be weighted differently.

    Process:
    1. Agent submits appeal in writing to appeals@[marketplace] within
       30 days of score computation.
    2. Marketplace acknowledges within 7 days.
    3. Marketplace investigates and responds within 21 days.
    4. If the appealing agent disagrees with the outcome, they may
       escalate to the Advisory Board (advisory@swarmsync.ai) for a
       final, binding decision on data accuracy questions.

    Correction timeline: If a data error is confirmed, the score MUST
    be recomputed and the corrected Execution Passport re-issued within
    7 days of confirmation.

11.4. Privacy and Data Regulations

    SwarmScore certificates contain aggregate metrics (session counts,
    success rates) and NOT transaction-level personal data. However,
    agent identifiers (agent_passport_id) may constitute personal data
    under GDPR/CCPA if the agent is a natural person.

    Operators SHOULD:

    - Store SwarmScore computation data in compliance with applicable
      data retention regulations (typically: retain for 90 days
      post-score-computation, then aggregate and anonymize).

    - Honor data subject access and deletion requests from agents who
      are natural persons, consistent with GDPR Articles 15 (access)
      and 17 (erasure) and CCPA Sections 1798.110 (access) and
      1798.105 (deletion).

    - Document their SwarmScore data processing as a legitimate
      business interest (reputation scoring for commercial transactions)
      in their privacy impact assessment.

    The SwarmScore certificate itself (the Execution Passport) contains
    only aggregate metrics and a hashed identifier; it does not contain
    transaction-level personal data and may be retained indefinitely.

---

12. Implementation Guide

    [INFORMATIVE]

    This section provides implementation guidance for operators. It is
    informative, not normative; the formula in Section 3 is normative.

12.1. Database Schema (AP2 / PostgreSQL)

    The following schema supports SwarmScore V1 computation:

    -- Core tables required for SwarmScore computation

    CREATE TABLE conduit_sessions (
      id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      agent_id        UUID NOT NULL REFERENCES agents(id),
      operator_id     UUID NOT NULL REFERENCES operators(id),
      started_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
      completed_at    TIMESTAMPTZ,
      status          VARCHAR(16) NOT NULL CHECK (
                        status IN ('PENDING', 'RUNNING', 'VERIFIED',
                                   'FAILED', 'ERROR', 'TIMEOUT')),
      action_count    INTEGER DEFAULT 0,
      session_cost_usd NUMERIC(10,4) DEFAULT 0,
      proof_hash      VARCHAR(64),  -- SHA-256 of proof bundle
      created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );

    CREATE INDEX idx_conduit_sessions_agent_date
      ON conduit_sessions (agent_id, completed_at)
      WHERE status IN ('VERIFIED', 'FAILED');

    CREATE TABLE ap2_transactions (
      id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      provider_id     UUID NOT NULL REFERENCES agents(id),
      buyer_id        UUID REFERENCES agents(id),
      operator_id     UUID NOT NULL REFERENCES operators(id),
      status          VARCHAR(16) NOT NULL CHECK (
                        status IN ('NEGOTIATING', 'HELD', 'EXECUTING',
                                   'DELIVERED', 'SETTLED', 'DISPUTED',
                                   'REFUNDED', 'CANCELLED')),
      escrow_amount_usd NUMERIC(12,2) NOT NULL,
      escrow_modifier NUMERIC(5,4),  -- 0.2500..1.0000; (4,4) cannot hold 1.0
      started_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
      settled_at      TIMESTAMPTZ,
      created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );

    CREATE INDEX idx_ap2_transactions_provider_date
      ON ap2_transactions (provider_id, settled_at)
      WHERE status IN ('SETTLED', 'DISPUTED', 'REFUNDED');

    -- SwarmScore cache table (recomputed every 24 hours or on new
    -- transaction)
    CREATE TABLE swarmscore_cache (
      agent_id            UUID PRIMARY KEY REFERENCES agents(id),
      score               INTEGER NOT NULL CHECK (score BETWEEN 0 AND 1000),
      tier                VARCHAR(8) NOT NULL CHECK (
                            tier IN ('NONE', 'STANDARD', 'ELITE')),
      conduit_sessions_90d INTEGER NOT NULL DEFAULT 0,
      conduit_successful_90d INTEGER NOT NULL DEFAULT 0,
      ap2_sessions_90d     INTEGER NOT NULL DEFAULT 0,
      ap2_successful_90d   INTEGER NOT NULL DEFAULT 0,
      conduit_contribution INTEGER NOT NULL DEFAULT 0,
      ap2_contribution    INTEGER NOT NULL DEFAULT 0,
      escrow_modifier     NUMERIC(5,4) NOT NULL,  -- (4,4) cannot hold 1.0
      computed_at         TIMESTAMPTZ NOT NULL DEFAULT NOW(),
      expires_at          TIMESTAMPTZ NOT NULL
        DEFAULT NOW() + INTERVAL '7 days',
      passport_json       JSONB,
      passport_signature  VARCHAR(64)
    );

12.2. Score Computation Pseudocode

    The following pseudocode is informative, but it is intended to
    mirror the normative formula in Section 3 exactly; implementers
    should match its behavior (see Appendix A for test vectors).

    function compute_swarmscore(agent_id, as_of_date):
      # 1. Gather 90-day window
      window_start = as_of_date - 90 days
      window_end   = as_of_date

      # 2. Count Conduit sessions. The right-hand agent_id is the
      #    function argument, not the column.
      conduit_total = COUNT(conduit_sessions AS s
        WHERE s.agent_id = agent_id
          AND s.completed_at BETWEEN window_start AND window_end
          AND s.status IN ('VERIFIED', 'FAILED'))

      conduit_success = COUNT(conduit_sessions AS s
        WHERE s.agent_id = agent_id
          AND s.completed_at BETWEEN window_start AND window_end
          AND s.status = 'VERIFIED')

      # 3. Count AP2 transactions
      ap2_total = COUNT(ap2_transactions
        WHERE provider_id = agent_id
          AND settled_at BETWEEN window_start AND window_end
          AND status IN ('SETTLED', 'DISPUTED', 'REFUNDED'))

      ap2_success = COUNT(ap2_transactions
        WHERE provider_id = agent_id
          AND settled_at BETWEEN window_start AND window_end
          AND status = 'SETTLED')

      # 4. Compute rates (handle zero-division)
      conduit_rate = conduit_success / conduit_total  if conduit_total > 0 else 0
      ap2_rate     = ap2_success / ap2_total          if ap2_total > 0 else 0

      # 5. Compute volume factors
      conduit_vf = min(1.0, conduit_total / 100)
      ap2_vf     = min(1.0, ap2_total / 50)

      # 6. Compute contributions (floor rounds down to the nearest integer)
      conduit_contrib = floor(conduit_rate * conduit_vf * 400)
      ap2_contrib     = floor(ap2_rate * ap2_vf * 600)

      # 7. Composite score (clamped)
      raw_score   = conduit_contrib + ap2_contrib
      score       = max(0, min(1000, raw_score))

      # 8. Escrow modifier
      raw_modifier    = 1.0 - (score / 1250.0)
      escrow_modifier = max(0.25, min(1.0, raw_modifier))

      # 9. Trust tier
      if score >= 850 AND conduit_total >= 100 AND ap2_total >= 50:
        tier = 'ELITE'
      elif score >= 700 AND conduit_total >= 50 AND ap2_total >= 25:
        tier = 'STANDARD'
      else:
        tier = 'NONE'

      return {
        score, tier, conduit_contrib, ap2_contrib,
        conduit_total, conduit_success, conduit_rate, conduit_vf,
        ap2_total, ap2_success, ap2_rate, ap2_vf,
        escrow_modifier
      }
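
    Steps 4 through 9 above can be sketched in Python as follows. This
    is an illustrative sketch, not a reference implementation: the
    function name score_from_counts and its parameters are assumptions,
    and the four counts are assumed to have been gathered already with
    the window and status filters shown above. Fractions keep every
    intermediate value exact, so floor() sees the true product.

```python
from fractions import Fraction
from math import floor


def score_from_counts(conduit_total, conduit_success, ap2_total, ap2_success):
    """Apply steps 4-9 of the pseudocode to already-filtered 90-day counts."""
    # Step 4: success rates, with an explicit zero-session guard.
    conduit_rate = (Fraction(conduit_success, conduit_total)
                    if conduit_total > 0 else Fraction(0))
    ap2_rate = Fraction(ap2_success, ap2_total) if ap2_total > 0 else Fraction(0)

    # Step 5: volume factors saturate at 100 sessions / 50 transactions.
    conduit_vf = min(Fraction(1), Fraction(conduit_total, 100))
    ap2_vf = min(Fraction(1), Fraction(ap2_total, 50))

    # Step 6: floor the exact product, once, at the end.
    conduit_contrib = floor(conduit_rate * conduit_vf * 400)
    ap2_contrib = floor(ap2_rate * ap2_vf * 600)

    # Step 7: composite score, clamped to [0, 1000].
    score = max(0, min(1000, conduit_contrib + ap2_contrib))

    # Step 8: escrow modifier, clamped to [0.25, 1.0].
    raw_modifier = 1 - Fraction(score, 1250)
    escrow_modifier = max(Fraction(1, 4), min(Fraction(1), raw_modifier))

    # Step 9: evaluate ELITE first, then STANDARD, then NONE.
    if score >= 850 and conduit_total >= 100 and ap2_total >= 50:
        tier = "ELITE"
    elif score >= 700 and conduit_total >= 50 and ap2_total >= 25:
        tier = "STANDARD"
    else:
        tier = "NONE"

    return {
        "score": score,
        "tier": tier,
        "conduit_contrib": conduit_contrib,
        "ap2_contrib": ap2_contrib,
        "escrow_modifier": escrow_modifier,
    }


# Hypothetical agent: 190 of 200 Conduit sessions verified, 76 of 80 AP2
# transactions settled.
result = score_from_counts(200, 190, 80, 76)
```

    For these hypothetical counts the contributions are 380 and 570,
    giving a score of 950, tier ELITE, and an escrow modifier clamped
    to 0.25.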

12.3. Common Implementation Pitfalls

    Pitfall 1 - FLOATING-POINT ROUNDING:
    Problem: Binary floating-point arithmetic can produce a result such
    as 303.9999 where the exact product is 304; floor() then returns 303
    instead of 304.
    Fix: Use exact arithmetic (for example Python's Decimal, or plain
    integer arithmetic) or, in JavaScript, Number.EPSILON-safe rounding.
    Apply floor() once, to the final product, not to intermediate
    values.
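
    One way to sidestep floating point entirely is sketched below in
    Python. It relies on the identity rate * volume_factor =
    success / max(total, cap) for total > 0, which follows from the
    definitions in Section 12.2, so each contribution reduces to a
    single integer floor division. The helper name contribution and its
    parameter names are illustrative and not part of this
    specification.

```python
def contribution(success, total, cap, weight):
    """Return floor(rate * volume_factor * weight) using integers only.

    rate * volume_factor = (success / total) * min(1, total / cap)
                         = success / max(total, cap)      for total > 0
    """
    if total <= 0:
        return 0  # zero-session guard: no sessions, no contribution
    return (success * weight) // max(total, cap)


# Conduit: cap 100 sessions, weight 400. AP2: cap 50 transactions, weight 600.
conduit_contrib = contribution(success=76, total=100, cap=100, weight=400)
ap2_contrib = contribution(success=19, total=20, cap=50, weight=600)
```

    With these hypothetical counts, conduit_contrib is exactly 304 and
    ap2_contrib is exactly 228, with no rounding step that could lose a
    point.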

    Pitfall 2 - ZERO-SESSION EDGE CASE:
    Problem: When conduit_sessions_90d = 0, some implementations compute
    rate as NaN or raise division-by-zero errors.
    Fix: Explicitly check for zero before division. If zero sessions,
    set rate = 0 and volume_factor = 0. Score contribution = 0.

    Pitfall 3 - TIMEZONE BUGS:
    Problem: The 90-day window uses "completed_at" which may be stored
    in local timezone rather than UTC. Sessions near the boundary may
    be included or excluded inconsistently.
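    Fix: Store all timestamps in UTC (TIMESTAMPTZ columns, as in
    Section 12.1) and run the computation with the session time zone
    set to UTC. Compute the 90-day window as:
    window_start = CURRENT_TIMESTAMP - INTERVAL '90 days'
    Avoid CURRENT_TIMESTAMP AT TIME ZONE 'UTC' here: it yields a
    TIMESTAMP WITHOUT TIME ZONE, which PostgreSQL reinterprets in the
    session time zone when comparing against a TIMESTAMPTZ column.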
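
    When the window is computed in application code rather than in
    SQL, the same rule might look like the following Python sketch. The
    helper names window_bounds and in_window are assumptions made for
    illustration; the sketch rejects naive datetimes outright so that a
    local-time value can never slip into the comparison, and it treats
    both window ends as inclusive, matching BETWEEN in Section 12.2.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)


def window_bounds(as_of):
    """Return (window_start, window_end) as timezone-aware UTC datetimes."""
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    window_end = as_of.astimezone(timezone.utc)
    return window_end - WINDOW, window_end


def in_window(completed_at, as_of):
    """True if completed_at falls inside the inclusive 90-day window."""
    if completed_at.tzinfo is None:
        raise ValueError("completed_at must be timezone-aware")
    start, end = window_bounds(as_of)
    return start <= completed_at.astimezone(timezone.utc) <= end


as_of = datetime(2025, 6, 30, 12, 0, tzinfo=timezone.utc)
window_start, window_end = window_bounds(as_of)
```

    Because the subtraction happens on a UTC value, the window is
    always exactly 90 * 24 hours long, regardless of daylight-saving
    transitions in any local zone.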

    Pitfall 4 - STATUS ENUM INCLUSION ERRORS:
    Problem: Including TIMEOUT or ERROR sessions in "total" count
    artificially inflates the denominator, reducing success rate.
    Fix: Only include VERIFIED and FAILED in conduit_total (excluding
    ERROR, TIMEOUT, PENDING, RUNNING). Only include SETTLED, DISPUTED,
    and REFUNDED in ap2_total.
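
    The inclusion rule can be made explicit with allow-lists, as in
    this minimal Python sketch. The constant and function names are
    hypothetical; the status values are those of the schema in
    Section 12.1.

```python
# Statuses that count toward the denominator ("total") of each rate.
CONDUIT_SCORED = frozenset({"VERIFIED", "FAILED"})
AP2_SCORED = frozenset({"SETTLED", "DISPUTED", "REFUNDED"})


def count_outcomes(statuses, scored, success_status):
    """Return (total, successes), ignoring statuses outside `scored`."""
    scored_only = [s for s in statuses if s in scored]
    total = len(scored_only)
    successes = sum(1 for s in scored_only if s == success_status)
    return total, successes


# ERROR, TIMEOUT, and PENDING are dropped before anything is counted.
sample = ["VERIFIED", "FAILED", "TIMEOUT", "ERROR", "VERIFIED", "PENDING"]
conduit_total, conduit_success = count_outcomes(
    sample, CONDUIT_SCORED, "VERIFIED")
```

    Here conduit_total is 3 and conduit_success is 2; counting all six
    rows would have understated the success rate as 2/6.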

    Pitfall 5 - ESCROW MODIFIER FLOOR BYPASS:
    Problem: Some implementations skip the floor clamping and produce
    escrow_modifier = 0.20 for score = 1000.
    Fix: Always apply max(0.25, min(1.0, raw_modifier)). The floor is
    0.25, not 0.0.
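
    A minimal Python sketch of the clamp, using Decimal so the result
    can be stored in a NUMERIC column without float noise. The function
    name escrow_modifier is taken from the pseudocode field of the same
    name; everything else is illustrative.

```python
from decimal import Decimal

FLOOR = Decimal("0.25")
CEILING = Decimal("1.0")


def escrow_modifier(score):
    """Return 1 - score/1250, clamped to the range [0.25, 1.0]."""
    raw_modifier = Decimal(1) - Decimal(score) / Decimal(1250)
    return max(FLOOR, min(CEILING, raw_modifier))


top = escrow_modifier(1000)   # raw value 0.20 is lifted to the floor
mid = escrow_modifier(500)    # raw value 0.60 passes through unchanged
```

    The raw modifier drops below 0.25 for every score of 938 or
    higher, so the clamp is exercised by any high-scoring agent, not
    only by a perfect score.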

    Pitfall 6 - STALE CACHE SERVING:
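    Problem: A score cached at T=0 is served for a new transaction at
    T=7d even though the score has since changed.
    Fix: Cache expires in 7 days (expires_at field). On cache miss or
    expiry, recompute. Invalidate the cache immediately whenever a new
    record enters the scored set: a Conduit session reaching VERIFIED
    or FAILED, or an AP2 transaction reaching SETTLED, DISPUTED, or
    REFUNDED. Failures change the denominator, so they affect the score
    as much as successes do.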
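
    The freshness rule might be checked as in the following Python
    sketch. The function cache_is_fresh and the parameter
    last_scored_event_at (the time of the most recent event that
    triggers invalidation) are hypothetical names, and treating the
    entry as expired at exactly computed_at + 7 days is an assumption.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(days=7)


def cache_is_fresh(computed_at, last_scored_event_at, now):
    """Return True only if the cached score may still be served."""
    # Expired: the 7-day lifetime (expires_at) has elapsed.
    if now >= computed_at + CACHE_TTL:
        return False
    # Invalidated: a scoring-relevant event arrived after computation.
    if last_scored_event_at is not None and last_scored_event_at > computed_at:
        return False
    return True


computed_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
fresh = cache_is_fresh(computed_at, None, computed_at + timedelta(days=6))
stale = cache_is_fresh(computed_at, None, computed_at + timedelta(days=7))
```

    A caller would recompute the score and rewrite the cache row
    whenever this check returns False.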

    Pitfall 7 - TIER EVALUATION ORDER:
    Problem: Evaluating STANDARD before ELITE causes agents meeting ELITE
    criteria to be labeled STANDARD.
    Fix: Always evaluate ELITE condition first, then STANDARD, then NONE.