docs: DOC-1234: Core docs Python-Groovy difference checker tool#7859
docs: DOC-1234: Core docs Python-Groovy difference checker tool#7859elijahpetty wants to merge 3 commits intodeephaven:mainfrom
Conversation
No docs changes detected for d91d2b5 |
There was a problem hiding this comment.
The name of this file and the output file name give me pause. Why all of the 5 stuff? Why have versions? Should this stuff be in docs/tools or docs/bin? I would argue that it doesn't belong in docs.
There was a problem hiding this comment.
Also, the name does not tell the users what it does. Something like compare_docs_py_java.py would be far more obvious about the purpose of the script.
| _EQUIVALENT_LINK_SPECS: list[tuple[list[str], str, list[str] | None]] = [ | ||
| (["empty_table", "emptyTable"], "emptyTable", None), | ||
| (["new_table", "newTable"], "newTable", None), | ||
| (["multi_join", "multiJoin", "MultiJoin.of"], "MultiJoin", ["multi-join", "multiJoin", "multijoin"]), | ||
| (["range_join", "rangeJoin"], "rangeJoin", ["range-join", "rangeJoin"]), | ||
| (["keyed_transpose", "keyedTranspose"], "keyedTranspose", ["keyed-transpose", "keyedTranspose"]), | ||
| (["partitioned_transform", "partitionedTransform"], "partitionedTransform", ["partitioned-transform", "partitionedTransform"]), | ||
| (["one_click", "oneClick"], "oneClick", ["one-click", "oneClick"]), | ||
| (["has_data_index", "hasDataIndex"], "hasDataIndex", ["has-data-index", "hasDataIndex"]), | ||
| (["meta_table", "metaTable", "meta"], "meta", ["meta_table", "metaTable", "meta"]), | ||
| ] |
There was a problem hiding this comment.
Most of the 3rd entries are just snake -> kebab case transformations. This could just be a function instead of more data to manage. Having said that, the first two do not have this mapping. I'm not sure why, but it jumps out at me.
| def should_exclude_reference_path(rel_path_str: str) -> bool: | ||
| """Return True if this relative path should be excluded due to being in /reference. |
There was a problem hiding this comment.
This is a good example of a simple place where function naming and pydocs make it very confusing for me as a reviewer. Questions that come to mind:
- Why do references need to be excluded?
- Why should references be excluded?
- Are only some references being excluded?
- etc.
To learn how to improve this, lean on AI. In windsurf try:
- "Improve all of the pydocs in this file. Use the Google standard."
- "as a senior developer, review all of the function names in this file. Teach me how to improve them. Do not change the code."
- Switch to planning mode. "As a senior developer, carefully review this file."
There was a problem hiding this comment.
I excluded reference docs for two reasons. The first is that the method names are so different in so many cases that I would need to create a map of which docs to compare. The second is that the actual methods work differently in so many cases that the text is different as a result. Margaret and I thought it was not worth comparing all of those just to make minor phrasing changes when much of the text will be different either way.
There was a problem hiding this comment.
I reviewed this file, but my review is a high-level review.
- Nobody loses a billion dollars if there is a bug in this.
- regular expression code is hard to review.
- The way this code is structured, named, and documented makes it extremely difficult to follow.
Having said that, my comments are mostly about the CS aspects of the code that will help you become a better developer. We should iterate a few times to help you learn. Between these iterations, you can lean on AI to help teach you. I have included detailed notes in one of the comments. Ask windsurf things like:
- "As a senior developer, give me a short 1 paragraph analysis of how this code could be better."
Then once you have feedback, you can more aggressively have Windsurf review and improve the code. Do this in planning mode so that you can get feedback on what it wants to change, and you can change the plan if you don't like it.
compare_docs_v5.py compares the docs, and create_annotated_comparison.py makes a .md file with easy-to-read output.