Add configurable Docker hostname #356

Merged
hexylena merged 3 commits into galaxyproject:dev from bgruening:add_configurable_host
Jun 17, 2015

Conversation

@bgruening
Member

This PR adds a new configuration option to Interactive Environments: docker_host. With this option you can specify a different Docker hostname. This is very useful if you only have a Docker client on your Galaxy host and your Docker daemon is running somewhere else.

If this is ok, I will add it to RStudio as well.
One open question is whether it's OK to call load_deploy_config() in __init__.

ping @jmchilton and @erasche

@hexylena
Member

  • I take it we would be accessing Docker over HTTPS (i.e. https://docs.docker.com/articles/https/)
    • in that case I'd strongly recommend adding an option for SSL client keys
  • xref "Allow for launching docker on remote hosts" (bgruening/galaxy-ipython#4)
  • if we're launching containers on remote hosts, we need to consider the port issue. We're still running netstat locally to determine an acceptable port.
    • when we thought about no. 4, above, one mechanism considered was the jobs mechanism. Using actual Galaxy jobs to launch/manage images in a much cleaner way than could be done otherwise.
    • unfortunately that still doesn't get around the issue that we essentially need to do a small bit of bidirectional communication with the $docker_host. We need to contact it, and say "I want to launch this image with these options", at which point it'll tell us, the head node, "the image is launched on this port" and the proxy can add that route.
  • we're starting to gather a LOT of common boilerplate for IEs. Maybe we should have an interactive_environment.ini file in config/plugins/interactive_environments/ which would hold some of this. I imagine you could have a shared command line/this option/etc.

@bgruening
Member Author

@erasche sure, there is a lot to do for IEs. I'm a little bit confused about what this has to do with this PR.
Do you want this option to be hidden to protect users?

I think Pulsar might be a good solution to most of these (known) problems. But this is better discussed elsewhere. Hopefully at GCC.

To which boilerplate are you referring? If there is any configuration that holds true for all IE's we should probably move this into galaxy.ini? I'm also fine with an interactive_environment.ini/xml where it makes sense. But this option is IE specific, imho. I would like to be able to have one VM for IPython containers and another one for RStudio ...

@jmchilton
Member

The proxy works out of the box for this? That is fantastic - I demand a gold star or some sort of blue ribbon :).

Can we modify the docker build command to use the host argument, so it doesn't need to be repeated in the ini file?

@martenson
Member

@jmchilton 🎀

@bgruening
Member Author

🌟!!!

It's working here on our Freiburg Galaxy Server, with the mentioned disadvantages from Eric, but it's working :)

We could indeed use the Docker hostname in the build command.
But the argument looks like this: -H tcp://glxdk1:4243. So we would need more information than just the hostname.

@jmchilton
Member

@bgruening I say we allow specifying a docker host then, instead of a proxy hostname, and pull the destination hostname from that?

>>> import urlparse
>>> urlparse.urlparse("tcp://glxdk1:4243").hostname
'glxdk1'

How does that sound? I am still trying to get away from the IEs and deployers building command-line pieces.

@bgruening
Member Author

Ok, sounds good. If not set, we simply do not set the -H parameter.
Is there any situation where we need to set the proxy hostname but not the Docker host?
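
A minimal sketch of that behaviour, assuming the framework builds the Docker command as an argument list. The function name `docker_base_command` and the `docker_host` parameter are hypothetical, not Galaxy's actual API; only the `-H` flag and the example host come from this thread.

```python
def docker_base_command(docker_host=None):
    """Return the docker CLI prefix, adding -H only when a host is configured.

    Hypothetical helper for illustration; not part of Galaxy.
    """
    command = ["docker"]
    if docker_host:
        # e.g. "tcp://glxdk1:4243"; when unset, the client keeps its local default
        command += ["-H", docker_host]
    return command


if __name__ == "__main__":
    print(" ".join(docker_base_command()))
    print(" ".join(docker_base_command("tcp://glxdk1:4243")))
```

Keeping the command as a list until the last moment avoids IEs and deployers pasting command-line fragments together by hand.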

@hexylena
Member

I'm +0 on this right now...if you aren't running ANYTHING other than docker on your glxdk1, then it's fine as-is (though I'd really like to see SSL client cert options added, I am NOT running docker containers on a remote host with no authentication).

sure, there is a lot to do for IEs. I'm a little bit confused about what this has to do with this PR. Do you want this option to be hidden to protect users?

No, not hidden. I'm just slightly worried this supports an impractical deployment method. You're essentially requiring that glxdk1 have free ports in the range used, otherwise images could get launched, the glxdk1 would start+fail them, and the user would have a bad experience.

This gets worse as usage ramps up: two docker containers can be spawned with the same port, because the Galaxy host says it's free. Two IPython containers, started at different times, could be given the same port number. The Galaxy host would say the port is free, but it is not free on the docker host.

I think Pulsar might be a good solution to most of these (known) problems. But this is better discussed elsewhere. Hopefully at GCC.

yeah, agreed. There's a lot to be done here, from fixing the proxy, to better methods for deploying/running these jobs.

To which boilerplate are you referring? If there is any configuration that holds true for all IE's we should probably move this into galaxy.ini? I'm also fine with an interactive_environment.ini/xml where it makes sense. But this option is IE specific, imho. I would like to be able to have one VM for IPython containers and another one for RStudio ...

And if I were to deploy them, I might want all of my docker images on a single VM. We should inherit a global IE configuration, and override it with IE specific settings. It would be nice if the following were standard, and then IE overridable:

  • command
  • apache_urls (eventually going away with proxy)
  • password_auth (default false, maybe remove this option altogether)
  • ssl (see above)
  • command_inject
  • galaxy_url
  • docker_host

So... really all of them. Our IE specific config files would be reduced to:

[docker]
image = bgruening/docker-ipython-notebook:dev

stealth edit: password_auth defaults to false, as does apache_urls and ssl with the nodejs proxy
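
One way to picture the "global defaults, IE-specific overrides" idea is to read two ini sources into the same parser, so values read later win. This is only a sketch using Python's standard `configparser`; the default values below and the `load_ie_config` helper are assumptions for illustration, not existing Galaxy configuration.

```python
import configparser

# Assumed global defaults; option names follow the list above, values are made up.
GLOBAL_DEFAULTS = """
[docker]
command = docker
password_auth = False
apache_urls = False
ssl = False
docker_hostname = localhost
"""

# The reduced IE-specific file from the comment above.
IE_SPECIFIC = """
[docker]
image = bgruening/docker-ipython-notebook:dev
"""


def load_ie_config(global_text, ie_text):
    """Merge global and IE-specific settings; IE-specific values override."""
    parser = configparser.ConfigParser()
    parser.read_string(global_text)
    parser.read_string(ie_text)  # later reads override earlier ones
    return dict(parser["docker"])


if __name__ == "__main__":
    for key, value in sorted(load_ie_config(GLOBAL_DEFAULTS, IE_SPECIFIC).items()):
        print(key, "=", value)
```

An IE that needs its own Docker VM would then only add a `docker_hostname` line to its own file.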

@hexylena
Member

@bgruening I can make the configuration options more generic in another PR. I don't want to stop your progress...just worried about

  • ports
  • SSL cert option

@bgruening
Member Author

@jmchilton a Docker hostname can have the following form: unix:///var/run/docker.sock
So either we only support tcp:// connections or we need to have two variables docker_hostname and docker_bind.
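
To make the difference concrete, here is a small sketch using Python 3's `urllib.parse` (the snippet earlier in this thread uses the Python 2 `urlparse` module). The helper name `proxy_hostname` is hypothetical; it only shows that a tcp:// address carries a hostname while a unix:// socket path does not.

```python
from urllib.parse import urlparse


def proxy_hostname(docker_host):
    """Return the hostname from a tcp:// Docker address, or None otherwise.

    Hypothetical helper for illustration; not part of Galaxy.
    """
    parsed = urlparse(docker_host)
    if parsed.scheme == "tcp":
        return parsed.hostname
    # unix:///var/run/docker.sock has an empty network location, so there is
    # no hostname to hand to the proxy.
    return None


if __name__ == "__main__":
    print(proxy_hostname("tcp://glxdk1:4243"))
    print(proxy_hostname("unix:///var/run/docker.sock"))
```

With a unix socket the proxy target has to come from somewhere else, which is the case for a second variable.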

@jmchilton
Member

@bgruening Okay - fair point - I guess we need two variables. Though you called this docker_host instead of docker_hostname in the PR - the hostname is probably right?

So I think this should be merged once the option is renamed to hostname, and then a follow-up pull request should be issued at some point by someone (maybe me) that adds the host binding support and SSL options. All of that should be taken care of at the framework level instead of being exposed to the IEs and in the command-line tweak flag.

To address more of @erasche's concerns - I wonder if the config/galaxy.ini should be able to override defaults for any of these.

Something like ie_default_docker_hostname - could serve as the default for [docker]hostname - but that value could be overridden on a per-IE basis still.

@jmchilton
Member

Also @erasche - it seems like this does address the ports thing, right? I think what is not shown in @bgruening's example is that he is adding a -H flag to the docker command field to communicate with a remote Docker host - so whatever port it binds will be the correct one - and then he is feeding both the remote host and the port to the proxy. That same docker command could also specify the SSL certs FWIW - but I do think these things should all be parameterized and handled by the framework so we have flexibility to do cool things in the future (though arguably that is then more that IE deployers need to learn).

@jmchilton
Member

I am 👍 on this - @erasche are you convinced yet that it is a step in the right direction?

@bgruening
Member Author

@jmchilton I made the name more explicit.
In general I'd like to move universal options into galaxy.ini, e.g. ie_default_docker_hostname.

@hexylena
Member

👍, let's get this merged.

@jmchilton

To address more of @erasche's concerns - I wonder if the config/galaxy.ini should be able to override defaults for any of these.
Something like ie_default_docker_hostname - could serve as the default for [docker]hostname - but that value could be overridden on a per-IE basis still.

Sure, that would be fine. I'm pretty ambivalent about where a "parent" IE configuration would go. Maybe we stick it in a file next to galaxy.ini? It's already pretty full...

Also @erasche - it seems like this does address the ports thing, right? I think what is not shown in @bgruening's example is that he is adding a -H flag to the docker command field to communicate with a remote Docker host - so whatever port it binds will be the correct one - and then he is feeding both the remote host and the port to the proxy.

Hmm, maybe I mis-communicated.

Say you have the following situation:

  1. You're using docker on a remote host
  2. You launch an IE, the netstat command says ports 8000-10000 are open
  3. The RNG picks port 8000, netstat agrees that it's free
  4. The IE is launched on the remote host
  5. You launch another IE. Because the first IE is listening on the remote host instead of localhost, netstat again says ports 8000-10000 are open
  6. The RNG randomly picks port 8000 again, netstat agrees that it's free on localhost
  7. The IE is launched on the remote host and immediately dies because it can't bind port 8000.
  8. The user gets connected to someone else's IE (if it's of the same type). If it's not, it likely crashes.

There's nothing in the commit to address that. The only thing you could do is to check the session map/proxy and interact with it to get a list of used ports on the remote host, which would help save you.
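
The bookkeeping suggested here could look roughly like the sketch below: instead of asking local netstat, keep a record of the ports already handed out per Docker host. `RemotePortRegistry` is a hypothetical helper, not part of Galaxy or the proxy, and it only knows about ports it allocated itself, so anything else listening on the remote host stays invisible to it.

```python
import random


class RemotePortRegistry:
    """Track ports handed out per Docker host (hypothetical, for illustration)."""

    def __init__(self, low=8000, high=10000, rng=None):
        self.low = low
        self.high = high
        self.rng = rng or random.Random()
        self.in_use = {}  # docker hostname -> set of allocated ports

    def allocate(self, docker_hostname):
        """Pick a random port not yet recorded as used on that host."""
        used = self.in_use.setdefault(docker_hostname, set())
        free = [p for p in range(self.low, self.high + 1) if p not in used]
        if not free:
            raise RuntimeError("no free ports recorded for %s" % docker_hostname)
        port = self.rng.choice(free)
        used.add(port)
        return port

    def release(self, docker_hostname, port):
        """Forget a port once its container has gone away."""
        self.in_use.get(docker_hostname, set()).discard(port)


if __name__ == "__main__":
    registry = RemotePortRegistry()
    first = registry.allocate("glxdk1")
    second = registry.allocate("glxdk1")
    print(first != second)
```

With such a record, step 6 above can no longer pick port 8000 a second time for the same remote host, although a real solution would still need the remote side to confirm the port.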

That same docker command could also specify the SSL certs FWIW - but I do think these things should all be parameterized and handled by the framework so we have flexibility to do cool things in the future (though arguably that is then more that IE deployers need to learn).

Okay, sure, fair enough. I guess the tl;dr was that I wanted to see more shared configuration

hexylena added a commit that referenced this pull request Jun 17, 2015
hexylena merged commit 691b3d6 into galaxyproject:dev Jun 17, 2015

@jmchilton
Member

@erasche Arg - you are right about the ports. I was thinking docker was picking them - I forgot we were.

@hexylena
Member

@jmchilton after GCC I'll start planning infrastructure to support this, since it's definitely a priority for me. If we want any traction in enterprise-y environments, a separate docker host will be important.

@hexylena
Member

Added a point 8, thought about it a bit more and had an unpleasant realisation:

The user gets connected to someone else's IE (if it's of the same type). If it's not, it maybe crashes/maybe connects.

This will definitely happen, because the proxy has auth'd the user for the specified remote host/port. If it's IPython, they'll likely see a "page not found", click the button, and may be able to get back to the IE that someone else is using, provisioned with the other user's API key/history IDs/etc.

@hexylena
Member

If you were expiring routes in the nodejs session map quickly enough, I'd say let's place a unique constraint on that. However, I think routes are fairly long lived and you just take the most recent one (if memory serves)

bgruening deleted the add_configurable_host branch June 17, 2015 18:52

4 participants