{
"lesson": "84-refusal-evaluation",
"title": "Capstone Lesson 84: Refusal Evaluation",
"questions": [
{
"stage": "pre",
"question": "What does under-refusal measure?",
"options": [
"How often the model answers prompts labeled unsafe",
"How often the model refuses prompts labeled safe",
"How many tokens the model outputs per response",
"The cost of one API call per prompt"
],
"correct": 0,
"explanation": ""
},
{
"stage": "pre",
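"question": "Why does this lesson use a mock LLM instead of a real model?",
"options": [
"Because real models are not allowed in capstone lessons",
"Because the same input produces the same output on every run, so any change in the results can be traced to a known code change",
"Because a mock LLM is more accurate than a real model",
"Because this lesson requires GPU hardware"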
],
"correct": 1,
"explanation": ""
},
{
"stage": "check",
"question": "Given 30 safe prompts, 5 of which were refused, what is the over-refusal rate?",
"options": [
"0.50",
"0.17",
"0.30",
"0.05"
],
"correct": 1,
"explanation": ""
},
{
"stage": "check",
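"question": "What does Expected Calibration Error (ECE) measure here?",
"options": [
"The fraction of unsafe prompts that are refused",
"The gap between the model's stated confidence and its observed accuracy, computed over confidence bins",
"The total number of tokens across all responses",
"The number of regex rules matched"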
],
"correct": 1,
"explanation": ""
},
{
"stage": "check",
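"question": "Why does the harness join unsafe prompts with the taxonomy from Lesson 82?",
"options": [
"To rerun the corpus loader from Lesson 82",
"To report under-refusal per attack category, so the team can see which boundary the model is leaking on",
"To add one to the severity of every prompt",
"To remove duplicates"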
],
"correct": 1,
"explanation": ""
},
{
"stage": "post",
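"question": "Why does the harness return both under-refusal and over-refusal instead of a single safety score?",
"options": [
"Because Python dataclasses prefer multiple fields",
"Because they are opposite errors, and a single number would hide whichever one is worse on a given build",
"Because under-refusal is only used in CI",
"Because over-refusal is computed by another team"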
],
"correct": 1,
"explanation": ""
}
]
}