Hi,
I have a few models that return structured output by using special tokens as delimiters. Currently, vLLM always skips special tokens during decoding, so those delimiters are missing from the returned text. Would it be possible to add skip_special_tokens as a generation parameter?
TGI sort of supports this by giving you the option to return individual tokens along with their IDs and a boolean indicating whether each one is special.
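
For illustration, here is a minimal sketch of the behavior being requested. It is a toy decoder in plain Python, not vLLM or TGI code: the vocabulary, the special token IDs, and the names `decode`, `token_details`, and `TokenInfo` are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Hypothetical toy vocabulary; real tokenizers define their own IDs and special tokens.
VOCAB = {0: "<s>", 1: "</s>", 2: "<sep>", 3: "name", 4: "Alice", 5: "age", 6: "30"}
SPECIAL_IDS = {0, 1, 2}


@dataclass
class TokenInfo:
    id: int
    text: str
    special: bool


def decode(token_ids, skip_special_tokens=True):
    """Join token texts, optionally dropping special tokens (assumed semantics)."""
    kept = [
        VOCAB[i]
        for i in token_ids
        if not (skip_special_tokens and i in SPECIAL_IDS)
    ]
    return " ".join(kept)


def token_details(token_ids):
    """Per-token view: ID, text, and whether the token is special."""
    return [TokenInfo(i, VOCAB[i], i in SPECIAL_IDS) for i in token_ids]


output_ids = [0, 3, 4, 2, 5, 6, 1]

# Delimiters are dropped, so the structure of the output is lost.
print(decode(output_ids))

# Delimiters are kept, so the caller can split on "<sep>".
print(decode(output_ids, skip_special_tokens=False))

# Alternative: per-token details with a "special" flag.
for tok in token_details(output_ids):
    print(tok)
```

With the flag set to False, the delimiter tokens stay in the text and the structured fields can be recovered by splitting on them. The per-token view gives the same information without changing the decoded string.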