Add an argument to ipcluster plugin to specify the number of engines#547

Open
clearf wants to merge 3 commits into jtriley:develop from clearf:ipcluster-plugin

clearf commented Aug 5, 2015

  • Add parameters MASTER_ENGINES and NODE_ENGINES to the ipcluster and ipclusterrestart plugins to specify the number of engines started on the master and on the nodes, respectively.
    • This makes it possible to run no calculations on the master (which can deplete its resources and cause it to hang), and
    • to use, e.g., scikit-learn's native joblib support (e.g., n_jobs=-1) for multiprocessing on the nodes, which reduces the amount of message passing between machines in the cluster.
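As a rough illustration of what the two parameters control, here is a minimal sketch of how per-host engine counts might be resolved. `plan_engines` and the host names are hypothetical and are not part of StarCluster; the defaults (one core left free on the master, one engine per core on the nodes) are assumptions for the example, not confirmed plugin behavior.

```python
def plan_engines(cpu_counts, master="master", master_engines=None, node_engines=None):
    """Return {hostname: engine_count}. Hypothetical helper, not StarCluster code."""
    plan = {}
    for host, cpus in cpu_counts.items():
        if host == master:
            # Assumed default: leave one core free on the master.
            count = max(cpus - 1, 0) if master_engines is None else master_engines
        else:
            # Assumed default: one engine per core on worker nodes.
            count = cpus if node_engines is None else node_engines
        plan[host] = count
    return plan


# Hypothetical cluster: hostname -> number of processors.
cluster = {"master": 4, "node001": 8, "node002": 8}

print(plan_engines(cluster))
# {'master': 3, 'node001': 8, 'node002': 8}

# No engines on the master, one engine per node (each node can then
# parallelize locally, e.g. through joblib).
print(plan_engines(cluster, master_engines=0, node_engines=1))
# {'master': 0, 'node001': 1, 'node002': 1}
```

Note that the sketch tests `is None` rather than truthiness, so an explicit 0 is honored instead of falling back to the default.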

Also addresses #538

@cancan101

Does this allow you to set master_engines to 0?

@cancan101

I wrote some similar logic: develop...cancan101:develop#diff-4774c0a25748eaab7628c5b506730127 (sorry, it's intermingled with a couple of other changes).

@clearf
Author

clearf commented Aug 6, 2015

Yes, it allows you to set master engines to zero and set NODE_ENGINES as well.

I've had problems where my master was overloaded by calculations and/or memory use (even with master_engines = num_processors-1), and the whole cluster timed out.

This PR is a pretty simple way of keeping engines off the master (and the parameters are respected by ipclusterrestart as well).

Comment thread starcluster/plugins/ipcluster.py Outdated

If self.master_engines == 0, won't this branch be taken? I.e., using 0 or None to turn off the master will not work.

Author

Hmm... this works in testing. Let me figure this out.

Oh, actually, at this point in the execution the value is the string '0', which evaluates to true. This is bad and confusing, though, so I will change the check to compare against None. Thanks.
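The pitfall discussed in this thread can be sketched as follows. This is not the plugin's actual code: `uses_default_truthy`, `uses_default_none_check`, and `parse_engines` are hypothetical names, and the sketch assumes the setting arrives from the config file as a string (or None when unset).

```python
def uses_default_truthy(master_engines):
    # Truthiness check: falls back to the default for any falsy value.
    return not master_engines


def uses_default_none_check(master_engines):
    # Explicit check: falls back to the default only when the setting is unset.
    return master_engines is None


def parse_engines(raw):
    # Convert the raw config value once, so later code deals with int or None.
    return None if raw is None else int(raw)


# The string '0' is non-empty and therefore truthy, so the truthiness
# check happens to honor it...
print(uses_default_truthy("0"))   # False
# ...but the integer 0 is falsy and would silently be replaced by the default.
print(uses_default_truthy(0))     # True

# The explicit None check behaves the same for both representations.
print(uses_default_none_check(parse_engines("0")))   # False
print(uses_default_none_check(parse_engines(None)))  # True
```

Parsing the value up front and comparing against None keeps the behavior independent of whether the setting is still a string at the point of the check.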

@cancan101

Also related to this PR: #538.

dantreiman added a commit to dantreiman/StarCluster that referenced this pull request Oct 4, 2017