Prepare initial release (#1), drop Python < 3.12 and Django support (#2) #3
+ gitignore venv
rlskoeser
left a comment
I've done an initial review and tested locally. I also tested updating a project that uses eulxml to use a local install of this branch and it works fine. 🎉
Is there an easy one-line shell command to rewrite eulxml to neuxml everywhere in a set of files? Can we add it somewhere in the readme?
I'll respond to your other questions in a separate comment.
| 0.1.0 | ||
| ----- | ||
|
|
||
| * Require lxml 3.4 for ``collect_ids`` feature used in duplicate id | ||
| support added in eulxml 1.1.2 | ||
| * Fork package with the new name `neuxml` | ||
| * Remove `forms` submodule and drop Django requirements | ||
| * Add GitHub workflow for pypi publication | ||
| * Update for Python 3.12 compatibility |
Thanks for truncating the change log - I think that's the right call. Probably need a little extra narration here, but that's probably my job.
I think we should list what version of eulxml we forked from and link to the old repo, but not include it as a version in the changelog.
| This codebase was forked from a package called **eulxml**, originally developed | ||
| by Emory University Libraries. To see and interact with the full development | ||
| history of **eulxml**, see `eulxml <https://github.com/emory-libraries/eulxml>`_ | ||
| and `eulcore-history <https://github.com/emory-libraries/eulcore-history>`_. |
Let's drop the eulcore-history link
| nosetests # for normal development | ||
| nosetests --with-coverage --cover-package=eulxml --cover-xml --with-xunit # for continuous integration | ||
| nosetests --with-coverage --cover-package=neuxml --cover-xml --with-xunit # for continuous integration |
How much effort to switch to pytest?
I ran this script and it actually didn't need to make any changes! 🤯
| .. image:: https://readthedocs.org/projects/neuxml/badge/?version=latest | ||
| :target: http://neuxml.readthedocs.org/en/latest/?badge=latest |
This won't exist yet, but I guess we should probably set up readthedocs for this project.
Maybe create an issue and remove the badge until it exists? Should it be part of initial release?
| **code** | ||
| .. image:: https://travis-ci.org/emory-libraries/eulxml.svg | ||
| .. image:: https://travis-ci.org/Princeton-CDH/neuxml.svg | ||
| :alt: travis-ci build | ||
| :target: https://travis-ci.org/emory-libraries/eulxml | ||
| :target: https://travis-ci.org/Princeton-CDH/neuxml | ||
|
|
||
| .. image:: https://coveralls.io/repos/github/emory-libraries/eulxml/badge.svg | ||
| :target: https://coveralls.io/github/emory-libraries/eulxml | ||
| .. image:: https://coveralls.io/repos/github/Princeton-CDH/neuxml/badge.svg | ||
| :target: https://coveralls.io/github/Princeton-CDH/neuxml | ||
| :alt: Code Coverage | ||
|
|
||
| .. image:: https://codeclimate.com/github/emory-libraries/eulxml/badges/gpa.svg | ||
| :target: https://codeclimate.com/github/emory-libraries/eulxml | ||
| .. image:: https://codeclimate.com/github/Princeton-CDH/neuxml/badges/gpa.svg | ||
| :target: https://codeclimate.com/github/Princeton-CDH/neuxml | ||
| :alt: Code Climate | ||
|
|
||
|
|
||
| .. image:: https://requires.io/github/emory-libraries/eulxml/requirements.svg | ||
| :target: https://requires.io/github/emory-libraries/eulxml/requirements | ||
| .. image:: https://requires.io/github/Princeton-CDH/neuxml/requirements.svg | ||
| :target: https://requires.io/github/Princeton-CDH/neuxml/requirements |
I think this will all go away - we're not using most of these anymore, and the ones we are using aren't that reliable.
We can remove and replace with GitHub Actions test / codeql badges when we add them.
|
Answers to your questions:
I'd like to use git flow for this repo, but not sure how that works for the initial setup. Maybe switch to that after the initial cleanup / conversion?
That seems fine for now... we can always skip some versions. I think the initial stable release should probably be 1.0 but I like getting an early 0.1 out to pypi asap.
I should probably check with the PUL copyright librarian about this! I know that the license allows us to fork it, and I think it will be mixed copyright - we should properly be applying that header to all of our source code for our CDH projects. Leave as is for now.
Well, it's pretty terrible if any of the unit tests are hitting real servers (I'm pretty sure we tried to avoid that in the original unit tests! but it's been a while) - that might even be the reason they turned off access. I'd love to figure out a solution for caching these locally. Can you identify any other unit tests that are hitting live servers? As a first step maybe we just use
Switching to I haven't done anything like that in Hatchling yet. Can you look first and see if there's a way to get rid of this step? If we can cache XSDs with the package somehow would we no longer need it? (We might want to make a new issue to track this)
I've already started using this on other CDH projects that are published on PyPI. Hopefully I will be able to remember what I did and get it set up for this project as well. (I'll have to figure out how to do it for a new project... When I did it before I was updating existing projects.) How soon do you think we will need that?
I usually do it on new release - that way it's based on the tag but you have to opt in. Sound reasonable? |
|
Ok, I think I've pretty much handled everything except the GitHub action and setting up
Looks like there were 6 of them, so I skipped them all for now.
As soon as we're ready to release on PyPI, which could be in the next couple of days if we want, unless we decide to go with a different auth method first.
Makes sense to me! |
|
@rlskoeser Something interesting with the tests and fetching by URL.
Easiest route here would be to just update On the one hand, that would be dependent on the XML catalog for a test to not hit a real server. On the other hand, I guess if the catalog is now always present in the package, then it's true that Do you have a preference here? |
|
@blms Two thoughts:
If we implement 2 now then fixing https support can wait. |
|
@rlskoeser Makes sense! We do have |
|
@rlskoeser Realizing now that several of the schemas (at least OAI:DC and MODS) actually have additional internal references to remote schemas (for example these imports), which were never stored in the xml catalog—so even mocking by always referencing the local files will still result in web requests, called by Maybe my approach to this is slightly wrong. Right now I'm globally shimming I also found another case that I missed (by just turning off my internet connection) which is that we use |
|
FWIW, looks like it is possible to block all network requests using pytest-socket, if we decide to go that route. I also realized that if we only rely on the XML Catalog, |
|
@blms I like the easy option of blocking all network requests for unit tests. Thanks for finding that. Any concerns about that solution on your side? New schemas: should be on the user to add those to a catalog (🤔 is it possible to have multiple catalogs?). We just need to document it somewhere, and I don't think we necessarily need to handle that in the first release of neuxml. I forgot about the nested references - is it possible to collect them and store them in the catalog? How much effort? We don't need a 100% solution for the initial release, so let's figure out what is reasonable and good enough. |
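For illustration, blocking network requests during unit tests can be approximated in plain Python by replacing `socket.socket` for the duration of a block, so that any attempt to open a connection fails loudly. This is only a rough sketch of the general idea, not a description of how pytest-socket is implemented; the names `block_network` and `NetworkBlockedError` are hypothetical. It only guards sockets opened through Python's `socket` module, so requests made outside of it are not affected.

```python
import socket
from contextlib import contextmanager


class NetworkBlockedError(RuntimeError):
    """Raised when code tries to open a socket while the network is blocked."""


@contextmanager
def block_network():
    """Temporarily make every attempt to create a socket raise an error.

    Hypothetical helper for illustration; not part of neuxml or pytest-socket.
    """
    original = socket.socket

    def guarded(*args, **kwargs):
        raise NetworkBlockedError("network access is disabled in this block")

    # Swap in the guard; socket.create_connection looks up this name too.
    socket.socket = guarded
    try:
        yield
    finally:
        # Always restore the real socket class, even if the block raised.
        socket.socket = original
```

A test that should never touch a live server could wrap its body in `with block_network():` and would then fail immediately instead of silently making a request.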
|
Fascinating—it turns out that lxml actually calls a subprocess for parsing, which is where the network requests are actually originating from, so we can't intercept those using In the meantime, I think the best we can do is add the missing ones to the catalog and hope that whoever is adding more schemas follows the instructions to also add nested references to the catalog. |
@blms sounds reasonable to me! Thanks for figuring all this out. Anything we should document? (readme ? developer notes? ) |
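One possible way to find the nested references that still need catalog entries is to scan each schema for `xs:import` and `xs:include` elements and list their `schemaLocation` values. The sketch below uses only the standard library and assumes the schema text is already available locally; the function name `nested_schema_locations` is hypothetical and it does not follow references recursively.

```python
import xml.etree.ElementTree as ET

# Namespace used by XML Schema elements such as xs:import and xs:include.
XS_NS = "{http://www.w3.org/2001/XMLSchema}"
REFERENCE_TAGS = {XS_NS + "import", XS_NS + "include"}


def nested_schema_locations(xsd_text):
    """Return schemaLocation values referenced by an XSD, in document order.

    Hypothetical helper for illustration; accepts the schema as str or bytes.
    """
    root = ET.fromstring(xsd_text)
    locations = []
    for element in root.iter():
        if element.tag in REFERENCE_TAGS:
            location = element.get("schemaLocation")
            # Skip imports without a location and avoid listing duplicates.
            if location and location not in locations:
                locations.append(location)
    return locations
```

Running this over each schema already in the catalog would give a list of locations to compare against the catalog entries, which could then be added by hand.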
|
@rlskoeser I left the skip on for the RDFLib network request since that seems like something that doesn't need to be done right away, and it's just one unit test. It fails if you remove the skip now thanks to I can go ahead and rewrite the readme documentation about the XML catalog stuff and rename |
|
@blms sounds good about the skip. This seems technical, let's put it in DEVNOTES for now until we know what needs to go in the readme.
|
|
||
| Migration from ``eulxml`` | ||
| ------------------------- | ||
|
|
||
| After updating your project's dependencies to point at the new package name, | ||
| you can run this one-line shell script to find and replace every instance of | ||
| ``eulxml`` with ``neuxml`` in all ``.py`` files in the current working | ||
| directory and subdirectories. | ||
|
|
||
| On MacOS: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| find . -name '*.py' -print0 | xargs -0 sed -i '' -e 's/eulxml/neuxml/g' | ||
|
|
||
|
|
||
| Or on other Unix-based operating systems: | ||
|
|
||
| .. code-block:: shell | ||
|
|
||
| find . -name '*.py' -print0 | xargs -0 sed -i 's/eulxml/neuxml/g' |
This is great, thank you for adding it
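For anyone without `sed` (or who wants the same behavior on every platform), the same find-and-replace could be done with a short Python script. This is a minimal sketch that mirrors the shell one-liners above, not something included in the PR; the function name `rename_package` is hypothetical. Like the `find` command, it touches every `.py` file under the starting directory, including any virtual environment stored there.

```python
from pathlib import Path


def rename_package(root=".", old="eulxml", new="neuxml"):
    """Replace every occurrence of `old` with `new` in .py files under root.

    Hypothetical helper for illustration. Returns the paths that were changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old, new)
        # Only rewrite files that actually contain the old name.
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    for changed_path in rename_package():
        print(f"updated {changed_path}")
```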
In this PR
Per #1:
eulxml to neuxml
forms submodule
pyproject.toml
Per #2:
pynose as a drop-in compatibility replacement for nose (we will likely want to eventually replace with pytest)
Per #5:
Questions
Draft state questions
develop here?
Copyright 20xx Emory University Libraries
For the PyPI GitHub action:
pyproject.toml before setting it up, so we can use modern build tools for producing the output to upload to PyPI. I may need some help on this because of the cmdclass functions in setup.py. It seems like running that kind of code during build might be supported by Hatchling, have you used it in that way before?