Add new practice exercise baffling-birthdays#1575

Merged
angelikatyborska merged 6 commits into main from jie-baffling-birthdays
Jun 15, 2025

@jiegillet
Contributor

New exercise.

This one is a bit of a special one, and I might have gotten carried away with the math :D

It involves randomness, and therefore the tests have the potential to be flaky. I did the math to ensure that the flakiness is kept to a minimum (like one fail per 100k runs), but the safer I make the tests, the less strict they become.

@github-actions
Contributor

github-actions Bot commented Jun 7, 2025

Thank you for contributing to exercism/elixir 💜 🎉. This is an automated PR comment 🤖 for the maintainers of this repository that helps with the PR review process. You can safely ignore it and wait for a maintainer to review your changes.

Based on the files changed in this PR, it would be good to pay attention to the following details when reviewing the PR:

  • General steps

    • 🏆 Does this PR need to receive a label with a reputation modifier (x:size/{tiny,small,medium,large,massive})? (A medium reputation amount is awarded by default, see docs)
  • Any exercise changed

    • 👤 Does the author of the PR need to be added as an author or contributor in <exercise>/.meta/config.json (see docs)?
    • 🔬 Do the analyzer and the analyzer comments exist for this exercise? Do they need to be changed?
    • 📜 Does the design file (<exercise>/.meta/design.md) need to be updated to document new implementation decisions?
  • Practice exercise changed

    • 🌲 Do prerequisites, practices, and difficulty in config.json need to be updated?
    • 🧑‍🏫 Are the changes in accordance with the community-wide problem specifications?
  • Practice exercise tests changed

    • ⚪️ Are all tests except the first one skipped?
    • 📜 Does <exercise>/.meta/tests.toml need updating?

Automated comment created by PR Commenter 🤖.

Member

@angelikatyborska angelikatyborska left a comment

I think I agree with the testing approach. It's good that you left comments for students that tell them how different sample sizes behave in terms of flaky tests.

The tests with your implementation are very fast, but I think some students might go too far with sample sizes and have problems with timeouts 🤔

~D[2019-02-12]
]

output = BafflingBirthdays.shared_birthday?(birthdates) == true
Member

The variable here and the one on line 50 are unused.

Member

@jiegillet sorry, I should have been clearer - please make those expressions into assertions 😅 now it's:

    warning: use of operator == has no effect
    │
 63 │       BafflingBirthdays.shared_birthday?(birthdates) == true
    │                                                      ~
    │
    └─ test/baffling_birthdays_test.exs:63:54
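
The difference between a bare comparison and an assertion can be sketched as follows. This is only an illustration: `SharedBirthdaySketch` is a hypothetical stand-in for the student's module, and the dates are made up.

```elixir
import ExUnit.Assertions

# Hypothetical stand-in for the student's module, only here to keep the sketch runnable.
defmodule SharedBirthdaySketch do
  def shared_birthday?(birthdates) do
    month_days = Enum.map(birthdates, fn date -> {date.month, date.day} end)
    length(Enum.uniq(month_days)) < length(month_days)
  end
end

birthdates = [~D[1990-03-14], ~D[2001-03-14]]

# Before: the comparison result is discarded, so the compiler warns and nothing is checked.
# SharedBirthdaySketch.shared_birthday?(birthdates) == true

# After: the result is asserted, so a wrong answer raises an assertion error.
assert SharedBirthdaySketch.shared_birthday?(birthdates)
refute SharedBirthdaySketch.shared_birthday?([~D[1990-03-14], ~D[2001-03-15]])
```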

Contributor Author

Oh, don't apologize for me being an idiot 😆

Comment thread exercises/practice/baffling-birthdays/test/baffling_birthdays_test.exs Outdated

# for a sample size of 100 and this delta, the assertion is expected to fail once in 100 runs
# for a sample size of 600 and this delta, the assertion is expected to fail once in a billion runs
delta = 0.83
Member

How did you choose the deltas in the last few tests?

Contributor Author

I used this Wolfram Alpha app:

  • enter the sample proportion (expected)
  • enter the confidence level (0.99, which means one failure in 100)
  • enter the sample size (100)

That gives you the 99% interval, meaning a random pick will fall in there 99% of the time. This interval goes from 0.03417 to 0.1997, and its width is 2 times the delta; that's how I got the value.
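
The quoted interval can be cross-checked with the normal approximation for a sample proportion. This is a minimal sketch and not necessarily what the Wolfram Alpha tool does internally: the module and function names are hypothetical, 2.5758 is the standard normal quantile for a two-sided 99% interval, and the proportion 0.1169 is an assumption taken as the midpoint of the quoted interval.

```elixir
defmodule ProportionIntervalSketch do
  # Standard normal quantile for a two-sided 99% confidence interval.
  @z_99 2.5758

  # Half-width (margin) of the interval under the normal approximation.
  def half_width(proportion, sample_size, z \\ @z_99) do
    z * :math.sqrt(proportion * (1 - proportion) / sample_size)
  end

  # Lower and upper bounds of the interval around the expected proportion.
  def interval(proportion, sample_size, z \\ @z_99) do
    margin = half_width(proportion, sample_size, z)
    {proportion - margin, proportion + margin}
  end
end

# Assumed inputs: proportion 0.1169 (midpoint of the quoted interval), sample size 100.
{low, high} = ProportionIntervalSketch.interval(0.1169, 100)
IO.inspect({low, high}, label: "99% interval")
```

Under these assumptions the bounds land close to the quoted 0.03417 and 0.1997, with a half-width of roughly 0.083.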

After that, I entered a confidence level of 0.999999999 (one failure in a billion) and tweaked the sample size until I found an interval that's fairly close to the one for the confidence level of 0.99. That came out to around 600 every time. It's not an exact calculation, but it should be close enough.
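
The same reasoning can be run in the other direction: fix the delta and ask how often an estimate would miss it for a given sample size. The sketch below uses the normal approximation and `:math.erfc/1` for the two-sided tail. The names are hypothetical, and the inputs (proportion 0.1169, delta 0.0828) are assumptions derived from the interval quoted above, not values taken from the test file.

```elixir
defmodule FailureRateSketch do
  # Probability that an estimate based on `sample_size` samples lands more than
  # `delta` away from `proportion`, under the normal approximation.
  def failure_rate(proportion, sample_size, delta) do
    standard_error = :math.sqrt(proportion * (1 - proportion) / sample_size)
    z = delta / standard_error
    # Two-sided tail probability of a standard normal variable beyond z.
    :math.erfc(z / :math.sqrt(2))
  end
end

# Assumed inputs derived from the quoted 99% interval.
rate_100 = FailureRateSketch.failure_rate(0.1169, 100, 0.0828)
rate_600 = FailureRateSketch.failure_rate(0.1169, 600, 0.0828)

IO.inspect(rate_100, label: "failure rate with 100 samples")
IO.inspect(rate_600, label: "failure rate with 600 samples")
```

With these inputs the rate for 100 samples is about one in 100, and the rate for 600 samples is on the order of one in a billion or smaller, which is consistent with the "close enough" estimate.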

I used mix test --repeat-until-failure 100 with a sample of 100 to check if the solution would actually fail, and it does about half the time, which is expected, so it seems to work. I did not try a billion times with a sample of 600, but I ran it for a bit and never managed to make it fail 😆
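
As a rough cross-check of that observation: if a single run fails with probability 0.01, the chance of seeing at least one failure in 100 runs follows directly. The sketch assumes the runs are independent and that only this one assertion can fail; the variable names are illustrative.

```elixir
# Assumed per-run failure probability (one failure in 100 runs).
failure_probability_per_run = 0.01
runs = 100

# Probability that at least one of the runs fails: roughly 0.63.
probability_of_at_least_one_failure =
  1 - :math.pow(1 - failure_probability_per_run, runs)

IO.inspect(probability_of_at_least_one_failure)
```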

I have one dark secret for this particular test though. If you try with a sample of 100, you will almost always fail the test, because with a sample of 100, most of the time you will estimate 100 or 99, which is outside of the theoretical range (99.08 to 1). But I chose to ignore it: it's complicated to explain, it's unlikely that people will pick a sample size of only 100, and if they do, they'll see the comment and pick something bigger.

Comment on lines +130 to +141
expected_count_standard_deviation = fn
  day when day <= 28 -> :math.sqrt(group_size * 12 / 365 * (1 - 12 / 365))
  day when day <= 30 -> :math.sqrt(group_size * 11 / 365 * (1 - 11 / 365))
  day when day == 31 -> :math.sqrt(group_size * 7 / 365 * (1 - 7 / 365))
end

counts_outside_95_percent_confidence_interval =
  day_frequencies
  |> Enum.filter(fn {day, count} ->
    abs(count - expected_count.(day)) > 1.96 * expected_count_standard_deviation.(day)
  end)
  |> length()
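
The snippet above appears to treat the count for each day of the month as a binomial variable: a day up to the 28th occurs in 12 months of a non-leap year, the 29th and 30th in 11, and the 31st in 7, so the probability is that number over 365 and the standard deviation is sqrt(n * p * (1 - p)). A standalone sketch of the same check follows. The module and function names are hypothetical, and the definition of the expected count is an assumption, since `expected_count` is not shown in the excerpt.

```elixir
defmodule DayFrequencySketch do
  # Number of months in a non-leap year that contain the given day of the month.
  def months_containing(day) when day <= 28, do: 12
  def months_containing(day) when day <= 30, do: 11
  def months_containing(31), do: 7

  # Mean of a binomial(group_size, p) variable with p = months / 365.
  # Assumed definition; the excerpt does not show expected_count.
  def expected_count(day, group_size) do
    group_size * months_containing(day) / 365
  end

  # Standard deviation of the same binomial variable.
  def standard_deviation(day, group_size) do
    p = months_containing(day) / 365
    :math.sqrt(group_size * p * (1 - p))
  end

  # True when the observed count is more than 1.96 standard deviations from the
  # mean, i.e. outside the approximate 95% interval.
  def outside_95_percent?(day, count, group_size) do
    abs(count - expected_count(day, group_size)) >
      1.96 * standard_deviation(day, group_size)
  end
end

IO.inspect(DayFrequencySketch.outside_95_percent?(1, 1_200, 36_500))
IO.inspect(DayFrequencySketch.outside_95_percent?(1, 1_300, 36_500))
```

Even for a correct solution, about 5% of the 31 days (one or two) are expected to fall outside this interval by chance, which is presumably why the test counts the outliers instead of requiring none.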
Member

Is there maybe a website that you could link in a comment in this code that could explain what's going on? My statistics knowledge is very rusty

Contributor Author

There's a lot going on, that's what I meant when I said that maybe I got carried away lol

There are many concepts involved:

I was wondering if I should explain each step in more detail, but it seems like too much when solving the exercise is actually pretty easy.
What do you think?

Member

Hmm... yes, that is way too much. I won't even pretend I understand 🙈

I see a few potential problems with this testing code:

Still, I'm pretty impressed with the effort you went through, and if I understood it, I'm sure I would agree it's the only reasonable way to write a reliable test for random values 🤓 I'm fine just accepting that I don't get it and letting it be merged.

Contributor Author

I would not have done this if it wasn't fun :)

I actually agree with your list of problems and share your concerns. I looked at Erik's test, and my tests include his (assert map_size(month_frequencies) == number_of_months), but yes, I am fine with future maintainers simplifying the tests if I get hit by a bus, or if this causes more confusion than it's worth.

@spec random_birthdates(group_size :: integer()) :: [Date.t()]
def random_birthdates(group_size) do
  for _ <- 1..group_size do
    year = generate_non_leap_year_january_first(0, 3000)
Member

This is an interesting year range 😬 just so that we're clear: it's completely unnecessary to make the year random, right? To pass the tests, you might just as well choose the same non-leap year for all birthdates.

Contributor Author

Yeah, as far as the tests are concerned you could pick year = 2025 and be done with it, that's true.
In terms of generating random Date.t(), this felt more natural to me. I guess it doesn't really matter?
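
For comparison, the fixed-year variant mentioned here could look like the sketch below. It is not the PR's implementation: the module name is hypothetical, and 2025 is used only because it is a non-leap year.

```elixir
defmodule RandomBirthdatesSketch do
  @spec random_birthdates(group_size :: integer()) :: [Date.t()]
  def random_birthdates(group_size) do
    # Every birthdate falls in the same non-leap year (an assumption of this sketch).
    first_day = ~D[2025-01-01]

    # The //1 step makes the range empty when group_size is 0.
    for _ <- 1..group_size//1 do
      # :rand.uniform(365) returns an integer from 1 to 365,
      # so the offset from January 1st is 0 to 364 days.
      Date.add(first_day, :rand.uniform(365) - 1)
    end
  end
end

IO.inspect(RandomBirthdatesSketch.random_birthdates(5))
```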

Member

Right, it does not matter! I just wanted to make sure 🙂

Comment thread config.json Outdated
Comment on lines +2349 to +2352
"randomness",
"dates-and-time",
"lists",
"enum"
Member

You have also used: list-comprehensions, ranges, tuples. I think list-comprehensions might be optional (you're not using cartesian products, right?), but ranges might be necessary for generating a list of a given length.

Additionally a MapSet, but we don't have a concept for that 🤷

Contributor Author

Thank you, I added them all, it could help to know the concepts.

> Additionally a MapSet, but we don't have a concept for that 🤷

I made a set concept for Elm a while ago. Want me to fork it? :)

Member

Sure thing, if you have the time, it would be very nice to have that concept in Elixir too

@jiegillet
Contributor Author

jiegillet commented Jun 13, 2025

> The tests with your implementation are very fast, but I think some students might go too far with sample sizes and have problems with timeouts 🤔

I ran some tests with different sample sizes:

  • 1M: 40s
  • 100k: 4s
  • 10k: 0.4s

The cutoff is 10 seconds right? That's a limit of 250k samples. That doesn't seem too bad, especially considering that the tests are kind of hinting that a sample of 600 would be fine.
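
The 250k figure follows from the measurements if run time scales linearly with sample size, which the three data points suggest. A small sketch of the arithmetic, with illustrative variable names:

```elixir
# Measurement quoted above: 1M samples took about 40 seconds.
measured_samples = 1_000_000
measured_seconds = 40

# Assumed cutoff, as discussed in this thread.
cutoff_seconds = 10

# Largest sample size that fits in the cutoff under linear scaling: 250_000.
max_samples = div(measured_samples * cutoff_seconds, measured_seconds)
IO.inspect(max_samples)
```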

@angelikatyborska
Member

> The cutoff is 10 seconds right?

I somehow remember it's 2x or 3x the average test runtime set in our config file ("average_run_time": 4), but maybe it's simply 10s. Either way, the values are similar (8s, 10s, or 12s).

> especially considering that the tests are kind of hinting that a sample of 600 would be fine.

Agreed, I'm no longer worried about test speed. Thanks!

@angelikatyborska angelikatyborska merged commit df0b279 into main Jun 15, 2025
9 checks passed
@angelikatyborska angelikatyborska deleted the jie-baffling-birthdays branch June 15, 2025 07:59