ccc init wizard misleads users into selecting sentence-transformers for local Ollama setups, causing crashes #181

@jimmckeeth

The interactive setup prompt for ccc init incorrectly labels providers, which misleads users trying to configure a local Ollama model. The CLI implies sentence-transformers is the choice for local models, while labeling litellm as a cloud option.

Steps to Reproduce:

  1. Run ccc init.
  2. Observe the prompt:
? Embedding provider (Use arrow keys)
 » sentence-transformers (local, free)
   litellm (cloud, 100+ providers)
  3. Select sentence-transformers (because I want "local/free" and I don't have an API key).
  4. Enter ollama/nomic-embed-text as the model.
  5. The application crashes because sentence-transformers attempts to process a LiteLLM/Ollama prefix as a HuggingFace identifier.

And to top it off, the documentation is useless. ollama/snowflake-arctic-embed is mentioned under LiteLLM but not under sentence-transformers, and the default of Snowflake/snowflake-arctic-embed-xs isn't mentioned anywhere in the docs. Even the README here says "LiteLLM-only; requires a cloud embedding provider and API key."

On top of that, running ccc init again or ccc reset --all doesn't help. I have to either delete .cocoindex_code/global_settings.yml or edit it manually. Thankfully, ccc doctor does mention that settings file, though only after a couple of pages of crash messages.


Expected Behavior:

  • The init walkthrough should clarify that litellm is required for local Ollama proxying.
  • ccc init should test the configuration before writing it to a hidden config file that then just causes crashes.
  • There should be an option such as ccc reset --global or ccc init --global to repeat the global init.
  • ccc doctor should trap the crash and provide suggested troubleshooting steps (like running ccc init --global again).
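
To illustrate the "test before writing" behavior, here is a minimal Python sketch. It is not taken from the CCC source: write_settings_if_valid and probe are hypothetical names, and probe stands in for whatever check would embed a short test string with the chosen provider and model. The settings text is treated as opaque, since I don't know the real file schema.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def write_settings_if_valid(path: Path, text: str, probe: Callable[[], None]) -> bool:
    """Run probe(); write text to path only if probe() does not raise.

    Hypothetical helper: probe is assumed to exercise the chosen
    provider/model once and raise on any failure.
    """
    try:
        probe()
    except Exception as exc:
        # Report the failure and leave any existing settings file untouched.
        print(f"Configuration check failed, settings not written: {exc}")
        return False

    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then swap it into place,
    # so a partially written file never replaces a working one.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    os.replace(tmp_name, path)
    return True
```

With something like this, a bad provider/model pair would be rejected during the wizard, and the previous settings (if any) would stay as they were.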

Suggested Fixes:

  • Change the litellm label to something like litellm (cloud APIs & local Ollama).
  • Add input validation that throws a clear warning or error immediately during the prompt if a user selects sentence-transformers but enters a model starting with ollama/.
    • Better yet, have a list of options the user can scroll through, with a free form "other".
    • Then test the selection to make sure it is valid.
  • Documentation that explains this at least a little.
    • Is Snowflake/snowflake-arctic-embed-xs the only sentence-transformers option?
    • CocoIndex-Code doesn't have any documentation (that I could find) beyond the README and the landing page. So users end up reading the CocoIndex documentation and then have to figure out what CocoIndex-Code does from the source code.
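
The input validation suggested above could be as small as the following Python sketch. The function name and the prefix tuple are hypothetical, not from the CCC source; only the ollama/ prefix and the two provider names come from this report. Note that it cannot simply reject any model name containing a slash, because valid HuggingFace identifiers such as Snowflake/snowflake-arctic-embed-xs contain one too.

```python
from typing import Optional

# Prefixes that indicate a LiteLLM-style route rather than a HuggingFace ID.
# Only "ollama/" comes from this report; extend the tuple as needed.
LITELLM_ROUTE_PREFIXES = ("ollama/",)


def check_provider_model(provider: str, model: str) -> Optional[str]:
    """Return a warning if the provider/model pair looks mismatched, else None."""
    name = model.strip().lower()
    if provider == "sentence-transformers" and name.startswith(LITELLM_ROUTE_PREFIXES):
        return (
            f"'{model.strip()}' looks like a LiteLLM route, not a HuggingFace model ID. "
            "Select the litellm provider to use a local Ollama model."
        )
    return None


# The combination from the steps to reproduce is flagged:
print(check_provider_model("sentence-transformers", "ollama/nomic-embed-text"))
# The default model passes (prints None):
print(check_provider_model("sentence-transformers", "Snowflake/snowflake-arctic-embed-xs"))
```

Running a check like this right after the model prompt would let the wizard warn immediately instead of crashing later.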

I understand that CCC is built on the full CocoIndex, and if I look at the full CCC source code and then read the CocoIndex docs I can figure it out, but when I have an install guide and a guided init, I expect it to work.

Don't get me wrong, I'm a big fan of CCC. I did a lunch and learn presentation on it at work, and am planning another presentation on it in a couple of weeks for a different group. But all three times I've installed it, it crashed and I had to edit the global_settings.yml settings file manually. I ended up creating an installation guide for coworkers because a number of them had trouble installing it too (partially due to annoying corporate networking configuration, but also because of the reasons described above).
