Refactor for common news endpoint #4195

Open
MizukiTemma wants to merge 13 commits into develop from
refactoring/common_news_endpoint

Conversation

@MizukiTemma
Member

@MizukiTemma MizukiTemma commented Mar 16, 2026

Short description

This PR implements a common endpoint for news (push notifications) and Tü News.

Proposed changes

  • Add a new endpoint all-news/ that delivers both PNs and Tü News posts
  • Add a source flag to distinguish PNs from Tü News posts
  • Keep the old endpoints (fcm/, sent_push_notifications) so as not to break the app
  • Add a Celery task to collect Tü News posts and save them as a cache
  • Add Tü News related models and a management command to integrate the functionality of our Tü News repository
  • The current approach assumes that we store Tü News posts in our Integreat CMS database; the memory cache solution proposed here comes later
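
As a rough illustration of the source flag from the list above, a merged response could tag each item with its origin so that the app can tell the two kinds apart. This is a minimal sketch, not the code of this PR: the helper `tag_with_source`, the field names `title` and `display_date`, and the flag values `"pn"` and `"tuenews"` are assumptions made for the example.

```python
from typing import Any


def tag_with_source(items: list[dict[str, Any]], source: str) -> list[dict[str, Any]]:
    """Return copies of the items with a ``source`` flag added (hypothetical field name)."""
    return [{**item, "source": source} for item in items]


# Hypothetical sample data; real items carry more fields.
push_notifications = [{"title": "Office closed", "display_date": "2026-03-10"}]
tuenews_posts = [{"title": "New language course", "display_date": "2026-03-12"}]

# One list for both kinds of news, each item marked with its origin.
all_news = tag_with_source(push_notifications, "pn") + tag_with_source(tuenews_posts, "tuenews")

# A client can then filter on the flag.
only_tuenews = [item for item in all_news if item["source"] == "tuenews"]
```

Copying the items instead of mutating them keeps the original records untouched, which matters if the same objects are still served by the old endpoints.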

Side effects

Faithfulness to issue description and design

There are no intended deviations from the issue and design.

  • Currently, Tü News Bridge supports only German, English, Farsi and Arabic. This PR checks Tü News posts for all languages that are available in the CMS.

  • We probably need a new section in the CMS where the Service Team can decide for which languages we fetch posts from Tü News (not implemented yet). Currently German, English, Farsi and Arabic are handled in our Tü News

How to test

  1. Use the command import_tuenews
  2. Check "Enable external news" in a region
  3. Try the new endpoint

❓ Does anyone have an idea how to test the social media header endpoints (news/<slug:news_type>/<slug:slug>/) ?
Self-answer from 24.04.2026: can be tested like this in the console:

[Screenshot of the console session]

Resolved issues

Fixes: #4174


Pull Request Review Guidelines

@steffenkleinle
Member

> German verbose names should probably be changed to English 🤔 Is there any harm @svenseeberg @steffenkleinle ?

What does that mean, can you give me some more info here please? What names are we talking about?

@MizukiTemma
Member Author

> > German verbose names should probably be changed to English 🤔 Is there any harm @svenseeberg @steffenkleinle ?
>
> What does that mean, can you give me some more info here please? What names are we talking about?

"Title", "Inhalt", "E-News Nummer", "Datum", "WP Post ID" for example

https://github.com/digitalfabrik/tunews/blob/66dcab37b28b0f4d9cbdbd2d91abdc7189ffa216/src/news/models/newsitem.py#L9-L19

I guess these aren't used in the app, but I want to be sure 😅 Maybe this is rather a question for @svenseeberg whether these are being used somewhere.

@MizukiTemma MizukiTemma changed the title [WIP] Refactor for common news endpoint Refactor for common news endpoint Mar 16, 2026
@MizukiTemma MizukiTemma marked this pull request as ready for review March 16, 2026 16:58
@MizukiTemma MizukiTemma added deadline Needs to be fixed in the given time needs-reviewer labels Mar 16, 2026
Contributor

@jonbulz jonbulz left a comment

This PR doesn't sit right with me. I don't think this is a good approach. I think there are 3 ways to handle this, all with their pros and cons:

  1. Store news in the cache, not the database: Have some kind of service fetch news from different sources and store them as JSON in our Redis cache. Then have our API simply fetch the cached news from external sources in addition to our internal PushNotifications.
  2. Properly store the external news in our database by transforming them to PushNotifications (with some kind of source flag for clarity).
  3. If the deadline is urgent, copy-paste the code from tunews as a Django app into our repository and dump the content from the existing database. IMO, this only makes sense as an intermediate step, i.e. the code should be removed at some point once we have implemented one of the above solutions. This is much easier if we keep it as a separate app.

Introducing a new NewsLanguage model in our cms app feels like a bad idea to me. This is how you get a messy database. No one wants a messy database.
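
To make option 1 more concrete, the following sketch shows the store-as-JSON and read-on-request flow. It is only an illustration under stated assumptions: `InMemoryCache` is a stand-in for a real backend such as Redis, and the key scheme and the function names `store_posts` and `load_posts` are hypothetical.

```python
import json
from typing import Any, Optional


class InMemoryCache:
    """Stand-in for a real cache backend such as Redis, for illustration only."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)


def cache_key(source: str, language_code: str) -> str:
    # Hypothetical key scheme: one entry per source and language.
    return f"external_news:{source}:{language_code}"


def store_posts(
    cache: InMemoryCache, source: str, language_code: str, posts: list[dict[str, Any]]
) -> None:
    """Called by the fetching service: serialise the posts once and cache them."""
    cache.set(cache_key(source, language_code), json.dumps(posts))


def load_posts(cache: InMemoryCache, source: str, language_code: str) -> list[dict[str, Any]]:
    """Called by the API: a cache miss yields an empty list instead of an error."""
    raw = cache.get(cache_key(source, language_code))
    return json.loads(raw) if raw is not None else []


cache = InMemoryCache()
store_posts(cache, "tuenews", "de", [{"title": "Beispiel", "link": "https://example.org/1"}])
```

With this split, the API never talks to the external source directly; it only reads whatever the fetching service stored last.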

title=post["title"],
language_code=language.bcp47_tag,
excerpt=post["content"],
url=post["link"],
Member Author

@steffenkleinle

We don't have get_absolute_url() for Tü News posts as they are not part of our contents. Is it acceptable to substitute url with link flag for Tü News posts? (examples of Tü News)

@MizukiTemma MizukiTemma requested a review from jonbulz April 2, 2026 20:34
@MizukiTemma
Member Author

@jonbulz
Thank you for the detailed suggestions and explanation 😍 I hope it's implemented better now 💪

Contributor

@jonbulz jonbulz left a comment

Thanks for working on this topic, this is a solid start! I do have some concerns, though:

Architecture / Extensibility

We already plan to add a third news source (Amal news) in the near future. There may be more to come after that. With this pattern, we have growing if/elif chains in several places. Also, the code for external news lives in push_notifications.py, which is a bit confusing. I think even though we decided against a database model, this deserves its own module and class. I don't want to be too specific with the proposed solution, but if you want, we can discuss this in a meeting.

Resilience

There are several possible unhandled exceptions in the code, see my additional comments. I think we need to be a bit more defensive when dealing with content from external sources. A malformed response should not crash our process.
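
A defensive parsing step of the kind described above could look roughly like this. It is a sketch under assumptions, not the importer from this PR: the required keys `title`, `content` and `link` are taken from the snippet quoted earlier in this thread, while the function names `parse_post` and `parse_posts` are hypothetical.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)

# Keys taken from the snippet in this thread; the real payload may contain more.
REQUIRED_KEYS = ("title", "content", "link")


def parse_post(raw: Any) -> Optional[dict[str, str]]:
    """Return a cleaned post, or None if the raw entry is malformed."""
    if not isinstance(raw, dict):
        return None
    # Every required key must be present and hold a non-empty string.
    if not all(isinstance(raw.get(key), str) and raw[key] for key in REQUIRED_KEYS):
        return None
    return {key: raw[key] for key in REQUIRED_KEYS}


def parse_posts(payload: Any) -> list[dict[str, str]]:
    """Keep the valid posts and log the rest instead of raising."""
    if not isinstance(payload, list):
        logger.warning("Unexpected payload type: %s", type(payload).__name__)
        return []
    posts = []
    for raw in payload:
        post = parse_post(raw)
        if post is None:
            logger.warning("Skipping malformed post: %r", raw)
            continue
        posts.append(post)
    return posts
```

One broken entry then costs a log line rather than the whole import run.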

Other Bugs

There are some other bugs I noticed, e.g. the duplicate URL name, see my comments. Additionally, I think we are still lacking pagination, and the results are not sorted again after tuenews have been added. I'd also prefer if we stick to one naming convention tuenews or tue_news :)

Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
Comment thread integreat_cms/api/v3/social_media_headers.py Outdated
Comment thread integreat_cms/api/v3/social_media_headers.py Outdated
Comment thread integreat_cms/api/urls.py Outdated
Comment thread integreat_cms/api/v3/push_notifications.py Outdated
Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
@MizukiTemma MizukiTemma removed the deadline Needs to be fixed in the given time label Apr 13, 2026
@MizukiTemma
Member Author

@jonbulz
Thank you for the re-review 😸 Besides your code suggestions, the following changes were applied:

  • Rename api/v3/push_notifications.py to api/v3/news.py
  • Stick to tuenews
  • Add sorting to all_news endpoint

> We already plan to add a third news source (Amal news) in the near future. There may be more to come after that. With this pattern, we have growing if/elif chains in several places. Also, the code for external news lives in push_notifications.py, which is a bit confusing. I think even though we decided against a database model, this deserves its own module and class. I don't want to be too specific with the proposed solution, but if you want, we can discuss this in a meeting.

I'm not sure a separate module reduces the if/else chains and/or brings a clear, intuitive structure 🤔 Maybe I'm not grasping what you have in mind.

> Additionally, I think we are still lacking pagination

What do you mean by "pagination"?

@MizukiTemma MizukiTemma requested a review from jonbulz April 13, 2026 11:58
@jonbulz
Contributor

jonbulz commented Apr 21, 2026

> What do you mean by "pagination"?

@steffenkleinle said in this comment:

> Otherwise a simple endpoint that supports pagination is enough.

This endpoint always returns an array of all news items, not a paginated response. Supporting pagination means we have to accept additional request parameters, e.g. offset/limit or page/page_size and wrap the response accordingly, e.g.

Response:
  {
    "count": 137,
    "next": "https://…/all-news/?limit=50&offset=50&channel=all",
    "previous": null,
    "results": [ ...]
  }

Or is pagination not needed after all?
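
For reference, a limit/offset wrapper producing the envelope shown above can be sketched in a few lines. The function name `paginate`, the default `limit` of 50 and the base URL are assumptions for the example; a real implementation would take `limit` and `offset` from the request parameters and preserve other query parameters such as `channel`.

```python
from typing import Any, Optional
from urllib.parse import urlencode


def paginate(items: list[Any], base_url: str, limit: int = 50, offset: int = 0) -> dict[str, Any]:
    """Wrap a slice of ``items`` in a count/next/previous/results envelope."""
    limit = max(1, limit)  # guard against zero or negative page sizes
    offset = max(0, offset)
    count = len(items)

    def page_url(new_offset: int) -> str:
        return f"{base_url}?{urlencode({'limit': limit, 'offset': new_offset})}"

    # Links are None at the respective end of the list.
    next_url: Optional[str] = page_url(offset + limit) if offset + limit < count else None
    previous_url: Optional[str] = page_url(max(0, offset - limit)) if offset > 0 else None
    return {
        "count": count,
        "next": next_url,
        "previous": previous_url,
        "results": items[offset : offset + limit],
    }


# Hypothetical example: the second page of 137 items.
page = paginate(list(range(137)), "https://cms.example/all-news/", limit=50, offset=50)
```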

Comment thread integreat_cms/core/management/commands/import_tuenews.py Outdated
@steffenkleinle
Member

> Or is pagination not needed after all?

I guess this depends on the number of items we are expecting. But it would be good practice at least, I guess :)

@hannaseithe hannaseithe self-requested a review April 21, 2026 11:15
@jonbulz
Contributor

jonbulz commented Apr 21, 2026

Concerning

> I'm not sure a separate module reduces the if/else chains and/or brings a clear, intuitive structure

What I mean is this: Imagine we want to add another news source. What we'd have to do now is to edit at least 4 files:

  • news.py to add the source to the result list (in the all_news endpoint)
  • social_media_headers.py to extend our if news_type == ... chain
  • we need a new celery task
  • we need a new management command

Each of those is a place where the source-specific logic is coupled to the general logic. If any of those steps is forgotten, it breaks the new source.
I think in the long run we'd be in a much better position if each source were a self-contained thing that the rest of the code could treat uniformly. So generating the result list in all_news would become something like:

items: list[NewsItem] = []
for source in NEWS_SOURCES:
    if source.is_enabled_for(region):
        items.extend(source.get_items_list(...))

or similar. I'd leave the exact implementation up to you, but I think that this abstraction would be very useful. I'm happy to discuss this further if you'd like :)
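
Expanded into something runnable, the idea could look like the sketch below. Everything in it is hypothetical: the `NewsSource` base class, the two concrete sources, the region represented as a plain dict, and the `external_news_enabled` key standing in for the "Enable external news" setting.

```python
from abc import ABC, abstractmethod
from typing import Any

NewsItem = dict[str, Any]  # placeholder type for the sketch


class NewsSource(ABC):
    """One self-contained news source (hypothetical interface)."""

    name: str

    @abstractmethod
    def is_enabled_for(self, region: dict[str, Any]) -> bool: ...

    @abstractmethod
    def get_items_list(self, region: dict[str, Any], language_code: str) -> list[NewsItem]: ...


class PushNotificationSource(NewsSource):
    name = "pn"

    def is_enabled_for(self, region):
        return True  # assumed: internal news is always available

    def get_items_list(self, region, language_code):
        return [{"source": self.name, "title": "Internal news"}]


class TueNewsSource(NewsSource):
    name = "tuenews"

    def is_enabled_for(self, region):
        # Assumed key standing in for the region's external news setting.
        return region.get("external_news_enabled", False)

    def get_items_list(self, region, language_code):
        return [{"source": self.name, "title": "Tü News post"}]


# Adding a source means adding one class and one entry here.
NEWS_SOURCES: list[NewsSource] = [PushNotificationSource(), TueNewsSource()]


def collect_all_news(region: dict[str, Any], language_code: str) -> list[NewsItem]:
    items: list[NewsItem] = []
    for source in NEWS_SOURCES:
        if source.is_enabled_for(region):
            items.extend(source.get_items_list(region, language_code))
    return items
```

The general code in `collect_all_news` never names a concrete source, which is the property the comment above is asking for.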

@MizukiTemma
Member Author

> > What do you mean by "pagination"?
>
> @steffenkleinle said in this comment:
>
> > Otherwise a simple endpoint that supports pagination is enough.
>
> This endpoint always returns an array of all news items, not a paginated response. Supporting pagination means we have to accept additional request parameters, e.g. offset/limit or page/page_size and wrap the response accordingly, e.g.
>
> Response:
>   {
>     "count": 137,
>     "next": "https://…/all-news/?limit=50&offset=50&channel=all",
>     "previous": null,
>     "results": [ ...]
>   }
>
> Or is pagination not needed after all?

Thank you for the explanation 😸 Now I get what was meant 🙈

But I'm not sure whether pagination works well with client-side filtering. Our response contains both push notification news and Tü News posts, and the app filters them by the source flag. Shouldn't they already be filtered on the CMS side for pagination to work?
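
The concern can be shown with a small example: if a page is cut first and filtered on the client afterwards, the client may end up with a nearly empty page. The sketch below assumes a hypothetical `source` filter applied on the server before slicing; the function name and the sample feed are made up for illustration.

```python
from typing import Any, Optional


def filter_then_slice(
    items: list[dict[str, Any]], source: Optional[str], limit: int, offset: int
) -> list[dict[str, Any]]:
    """Filter on the server first, then cut the page out of the filtered list."""
    if source is not None:
        items = [item for item in items if item["source"] == source]
    return items[offset : offset + limit]


# Hypothetical feed: mostly push notifications with only a few Tü News posts.
feed = [{"id": n, "source": "tuenews" if n % 5 == 0 else "pn"} for n in range(20)]

# Paginating first and filtering on the client leaves a nearly empty page ...
client_side = [item for item in feed[0:5] if item["source"] == "tuenews"]

# ... whereas filtering first fills the page as far as matching items exist.
server_side = filter_then_slice(feed, "tuenews", limit=5, offset=0)
```

Here the client-side variant yields one Tü News post from the first page, while the server-side variant yields four.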

@MizukiTemma
Member Author

> Concerning
>
> > I'm not sure a separate module reduces the if/else chains and/or brings a clear, intuitive structure
>
> What I mean is this: Imagine we want to add another news source. What we'd have to do now is to edit at least 4 files:
>
>   • news.py to add the source to the result list (in the all_news endpoint)
>   • social_media_headers.py to extend our if news_type == ... chain
>   • we need a new celery task
>   • we need a new management command
>
> Each of those is a place where the source-specific logic is coupled to the general logic. If any of those steps is forgotten, it breaks the new source. I think in the long run we'd be in a much better position if each source were a self-contained thing that the rest of the code could treat uniformly. So generating the result list in all_news would become something like:
>
> items: list[NewsItem] = []
> for source in NEWS_SOURCES:
>     if source.is_enabled_for(region):
>         items.extend(source.get_items_list(...))
>
> or similar. I'd leave the exact implementation up to you, but I think that this abstraction would be very useful. I'm happy to discuss this further if you'd like :)

I see. Thank you for sharing your idea :)

I would suggest that we first agree within the team on how exactly it should be implemented. Otherwise everyone imagines something different and this PR turns into a long discussion thread 😓 I also want to avoid this PR staying unmerged for a long time, as the App Team is waiting for us.

@MizukiTemma MizukiTemma requested a review from jonbulz April 24, 2026 13:26
@MizukiTemma
Member Author

@jonbulz @hannaseithe
Thank you for the idea exchange meeting 😸 Here is my attempt at a new structure 🔧 It's now similar to the MT providers.

@MizukiTemma MizukiTemma force-pushed the refactoring/common_news_endpoint branch from 31a0b61 to 09e5045 Compare April 24, 2026 14:14
@svenseeberg
Member

> I guess these aren't used in the app, but I want to be sure 😅 Maybe this is rather a question for @svenseeberg whether these are being used somewhere.

Well, the TuNews Number is shown in each article: https://integreat.app/tuebingen/de/news/tu-news/2290

But you need to check if we render the number into the article or whether this is already part of the content. I think at some point we had the idea of creating a nice, structured footer with the structured data. But I think it's likely that we never did that and never will ;)

@steffenkleinle
Member

> > I guess these aren't used in the app, but I want to be sure 😅 Maybe this is rather a question for @svenseeberg whether these are being used somewhere.
>
> Well, the TuNews Number is shown in each article: https://integreat.app/tuebingen/de/news/tu-news/2290
>
> But you need to check if we render the number into the article or whether this is already part of the content. I think at some point we had the idea of creating a nice, structured footer with the structured data. But I think it's likely that we never did that and never will ;)

Yes, it's already part of the content; we don't actively use enewsno atm.

Contributor

@hannaseithe hannaseithe left a comment

Hello Mizuki, thank you for working on this. I do like that you implemented an abstraction layer for the news managers. This makes it much cleaner. In my review I have focused on improvements to the abstraction and haven't gone into much detail regarding the specific implementations.

Comment thread integreat_cms/api/v3/news.py Outdated

from ...news_managers.abstract_news_manager import AbstractNewsManager

CHOICES: Final[list[tuple[str, type[AbstractNewsManager]]]] = [
Contributor

Suggested change
- CHOICES: Final[list[tuple[str, type[AbstractNewsManager]]]] = [
+ CHOICES: Final[list[AbstractNewsManager]] = [

or

Suggested change
- CHOICES: Final[list[tuple[str, type[AbstractNewsManager]]]] = [
+ CHOICES: Final[list[type[AbstractNewsManager]]] = [

Meaning:

  1. I would get rid of the str in the tuple and make that a name property on the News Manager,
  2. and either already save instances of the Manager in CHOICES (suggestion 1) or make the methods on the News Manager @classmethods, so that we can follow a singleton(-ish) pattern here.

Reason for 1: The name of the manager should be encapsulated in the manager itself. It seems redundant/unnecessary to keep it as an external lookup key

Reason for 2: Instantiating a new manager every time we want to access a method seems not very clean, even if it works here because we do not have any instance state (yet).
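
A sketch of suggestion 1 combined with the name property could look as follows. The class bodies are reduced stand-ins and the concrete manager names are hypothetical; only `AbstractNewsManager` and `CHOICES` are names that appear in this thread, and `get_manager` is an assumed helper.

```python
from abc import ABC, abstractmethod
from typing import Final


class AbstractNewsManager(ABC):
    """Reduced stand-in for the class discussed here; only the name is modelled."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Identifier used for lookups, owned by the manager itself."""


class TueNewsManager(AbstractNewsManager):  # hypothetical concrete manager
    @property
    def name(self) -> str:
        return "tuenews"


class PushNotificationManager(AbstractNewsManager):  # hypothetical concrete manager
    @property
    def name(self) -> str:
        return "pn"


# Suggestion 1: instances are created once and reused.
CHOICES: Final[list[AbstractNewsManager]] = [TueNewsManager(), PushNotificationManager()]


def get_manager(name: str) -> AbstractNewsManager:
    """Look up a manager by its own name instead of an external tuple key."""
    for manager in CHOICES:
        if manager.name == name:
            return manager
    raise KeyError(name)
```

Because the name lives on the manager, the list and the lookup key can no longer drift apart.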

Member Author

I'll go with suggestion 1 👍

Comment thread integreat_cms/news_managers/abstract_news_manager.py
Comment thread integreat_cms/api/v3/social_media_headers.py Outdated
@abstractmethod
def collect_news_items(
    self, region_slug: str, language_slug: str, channel: str
) -> list[dict]:
Contributor

Suggested change
- ) -> list[dict]:
+ ) -> list[NewsItem]:

I would use typing here to validate/guarantee the implicit assumptions about what a NewsItem needs to look like.


from ..cms.models import Language, Region


Contributor

Suggested change
@dataclass
class NewsItem:
    display_date: Any  # we rely on this existing in news.py

But if we actually want to guarantee more than that as a data structure in the API (which I assume we do), this should be expanded.
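
An expanded version might look like the sketch below. Apart from `display_date`, all field names are assumptions for illustration, as is the `to_json_dict` helper; the point is only that managers can return typed items while the endpoint still serialises plain dicts.

```python
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class NewsItem:
    # Field names other than display_date are assumptions for this sketch.
    title: str
    display_date: datetime
    source: str
    url: Optional[str] = None

    def to_json_dict(self) -> dict:
        """Convert to a plain dict that a JSON serialiser accepts."""
        data = asdict(self)
        data["display_date"] = self.display_date.isoformat()
        return data


item = NewsItem(title="Example", display_date=datetime(2026, 4, 24, 13, 26), source="tuenews")
```

A missing or misspelled field then fails when the item is constructed, not later when the endpoint sorts by `display_date`.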

Member Author

Hmm 🤔 Do you mean we don't return a dict anymore?


sorted_result = sorted(result, key=lambda i: i["display_date"], reverse=True)

return JsonResponse(sorted_result, safe=False)
Contributor

I have a problem here with the fact that we are breaking our established pattern of using a transform function inside the endpoint, which is our guarantor of the data structure that we expose on our API endpoint. Now we have two hidden transform functions that create different data structures and that are not directly accessible from the endpoint module.

I would expect to have a transform_news_item function that is applied to every element in the sorted results, so that if I want to look up which structure the endpoint follows, I would just take a look at the transform function (basically, these functions fulfill a role similar to the serializers in the Django REST framework).
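
As a sketch of that pattern, the endpoint module could own a single transform and apply it after sorting. The name `transform_news_item` comes from the comment above; the exposed fields and the `build_response_body` helper are assumptions for the example.

```python
from datetime import datetime
from typing import Any


def transform_news_item(item: dict[str, Any]) -> dict[str, Any]:
    """Single place that defines the shape exposed by the endpoint (assumed fields)."""
    return {
        "title": item["title"],
        "source": item["source"],
        "display_date": item["display_date"].isoformat(),
        "url": item.get("url"),
    }


def build_response_body(result: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Sort newest first, then apply the same transform to every element."""
    sorted_result = sorted(result, key=lambda i: i["display_date"], reverse=True)
    return [transform_news_item(item) for item in sorted_result]


# Hypothetical items from two managers with slightly different internal shapes.
body = build_response_body(
    [
        {"title": "Older", "source": "pn", "display_date": datetime(2026, 4, 1), "internal_id": 7},
        {"title": "Newer", "source": "tuenews", "display_date": datetime(2026, 4, 2), "url": "https://example.org/n"},
    ]
)
```

Fields that a manager carries internally, such as `internal_id` in the sample data, do not leak into the response because the transform lists the exposed keys explicitly.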

Member Author

I understand your point, but then Tü News posts are transformed every time the API is called. Instead, they can be stored directly in the form in which they are delivered. And in my opinion it's consistent enough if we put the transform functions for news items into the respective news managers: the news managers are then an exception in that their transform functions are not in the endpoint, but each of them has its own transform function. In addition, jumping between a news manager and the endpoint only for this very last step breaks the workflow.

This is very much a matter of preference. I'll stick with the current version unless a strong reason appears for moving the transformation out of the news managers and into the endpoint.

Comment thread integreat_cms/news_managers/abstract_news_manager.py
Comment thread integreat_cms/news_managers/abstract_news_manager.py Outdated
@hannaseithe
Contributor

One thing I forgot: I feel we should have a test for the new endpoint

Development

Successfully merging this pull request may close these issues.

Provide common news endpoint / Refactor TüNews

5 participants