Signed-off-by: shaoyuheng1 <shaoyuheng1@outlook.com>
Code Review
This pull request significantly enhances the data annotation pipeline by introducing advanced example selection strategies, including similarity-based, diversity-based, and contrastive learning approaches. It adds support for task-specific guidance, Chain-of-Thought reasoning, self-consistency voting, and social media sentiment analysis. Additionally, the implementation now supports batch processing and parallel annotation requests. Feedback focuses on improving code portability by removing hardcoded absolute paths, cleaning up formatting artifacts (extraneous backslashes), refining broad exception handling, and reducing duplication in API configurations.
DATA_DIR = '/root/flagos/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data'
OUTPUT_DIR = '/root/flagos/OpenSeek/openseek/competition/LongContext-ICL-Annotation/outputs'
Hardcoding absolute paths specific to a local environment (/root/flagos/...) makes the code non-portable. It is better to use relative paths or environment variables to define these directories.
Suggested change:
- DATA_DIR = '/root/flagos/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data'
- OUTPUT_DIR = '/root/flagos/OpenSeek/openseek/competition/LongContext-ICL-Annotation/outputs'
+ DATA_DIR = './data'
+ OUTPUT_DIR = './outputs'
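One possible way to apply this suggestion, as a minimal sketch rather than the PR's actual code: resolve both directories relative to the script and allow an optional environment-variable override. The helper `resolve_dir` and the variable names `ICL_DATA_DIR` and `ICL_OUTPUT_DIR` are hypothetical and do not appear in the PR.

```python
import os
from pathlib import Path


def resolve_dir(env_var: str, default_name: str, base: Path) -> Path:
    """Return the directory named by env_var if it is set, else base/default_name."""
    override = os.environ.get(env_var)
    return Path(override).expanduser() if override else base / default_name


# Base directory: the folder containing this script. Falls back to the current
# working directory when __file__ is undefined (e.g. an interactive session).
BASE_DIR = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()

# Hypothetical environment variable names; adjust to the project's conventions.
DATA_DIR = resolve_dir("ICL_DATA_DIR", "data", BASE_DIR)
OUTPUT_DIR = resolve_dir("ICL_OUTPUT_DIR", "outputs", BASE_DIR)
```

Anchoring on the script location rather than `./` keeps the paths stable when the script is launched from a different working directory.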
                    default='/root/flagos/OpenSeek/openseek/competition/LongContext-ICL-Annotation/outputs/',
                    help='Prefix path to save the evaluation logs.')
parser.add_argument('--tokenizer_path', type=str,
                    default='/root/flagos/Qwen3-4B')
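The same portability concern applies to these argument defaults. A hedged sketch of one alternative: take defaults from environment variables and require the tokenizer path when none is configured. The flag name `--log_path_prefix` (the original flag name is not visible in this snippet), the function `build_parser`, and the variables `ICL_OUTPUT_DIR` and `TOKENIZER_PATH` are assumptions for illustration only.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose defaults do not depend on one machine's layout."""
    parser = argparse.ArgumentParser(description="Evaluation options (sketch).")

    # Hypothetical flag name; the default is relative unless overridden via env.
    parser.add_argument('--log_path_prefix', type=str,
                        default=os.environ.get('ICL_OUTPUT_DIR', './outputs/'),
                        help='Prefix path to save the evaluation logs.')

    # No machine-specific default: read from the environment, else require the flag.
    env_tokenizer = os.environ.get('TOKENIZER_PATH')
    parser.add_argument('--tokenizer_path', type=str,
                        default=env_tokenizer,
                        required=env_tokenizer is None,
                        help='Local path or model identifier of the tokenizer.')
    return parser
```

With this in place, a call such as `AutoTokenizer.from_pretrained(args.tokenizer_path, trust_remote_code=True)` can reuse the parsed value instead of repeating a literal path.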
\
tokenizer = AutoTokenizer.from_pretrained("/root/flagos/Qwen3-4B", trust_remote_code=True)
\
\
\
except:
tfidf_matrix = vectorizer.transform(texts)
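The review summary asks for narrower exception handling than a bare `except:`. A minimal sketch of that idea, using a hypothetical wrapper `safe_transform` that is not part of the PR: catch only the exception types a vectorization step is expected to raise, log them, and let everything else (including `KeyboardInterrupt`) propagate. The choice of `ValueError` and `AttributeError` is an assumption; scikit-learn's `NotFittedError` subclasses both, so an unfitted vectorizer would be covered.

```python
import logging

logger = logging.getLogger(__name__)


def safe_transform(transform, texts, fallback=None):
    """Call transform(texts), handling only the errors we expect.

    `transform` is any callable, e.g. a fitted vectorizer's `transform` method.
    Unexpected exceptions are deliberately not caught.
    """
    try:
        return transform(texts)
    except (ValueError, AttributeError) as exc:
        # Record what went wrong instead of silently swallowing it.
        logger.warning("Vectorization failed, using fallback: %s", exc)
        return fallback
```

A call site would then read `tfidf_matrix = safe_transform(vectorizer.transform, texts)`, with the caller deciding what to do when the fallback is returned.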
openai.base_url = "http://localhost:9010/v1/"
model = "Qwen3-4B-ascend-flagos"
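To reduce the duplicated API configuration noted in the summary, the endpoint and model name could be defined once and imported wherever requests are made. The sketch below is one possible shape, not the PR's implementation: `ApiConfig`, `load_api_config`, and the environment variable names `ANNOTATION_BASE_URL` and `ANNOTATION_MODEL` are hypothetical, and the defaults simply mirror the values shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiConfig:
    """Single source of truth for the annotation endpoint settings."""
    base_url: str
    model: str


def load_api_config() -> ApiConfig:
    """Read settings from the environment, falling back to the PR's values."""
    return ApiConfig(
        base_url=os.environ.get("ANNOTATION_BASE_URL", "http://localhost:9010/v1/"),
        model=os.environ.get("ANNOTATION_MODEL", "Qwen3-4B-ascend-flagos"),
    )
```

Each script would then call `cfg = load_api_config()` once and set `openai.base_url = cfg.base_url`, so a change of endpoint or model touches a single place.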