Merged
Changes from 28 commits
31 commits
780b89e
4990 first past instance creation
Sep 10, 2018
884ae90
check for existence of security group before creating it #4990
pdurbin Sep 11, 2018
8db31c9
4990 working script first pass
Sep 11, 2018
9d04a2b
#4990 clean up create and delete all PoC
Sep 11, 2018
15e4454
Ansible pre-setup and larger ec2 instance #4990
Sep 14, 2018
510a390
Add 8080 security group ec2 #4990
Sep 14, 2018
a35c847
pass branch to ansible correct #4990
Sep 14, 2018
cde128a
Deploy branch ec2-ansible #4990
Sep 17, 2018
490d941
Add line about terminating instance #4990
Sep 17, 2018
6ce1215
Merge branch 'develop' into 4990-ec2-ansible-scripting
Sep 17, 2018
7bc6af8
Add fixme #4990
Sep 17, 2018
19e2915
comment cleanup #4990
Sep 17, 2018
6570aea
Ignore *.pem #4990
Sep 18, 2018
1d1794c
consistency: 2 spaces for indentation
pdurbin Sep 18, 2018
7052fdf
fix typo, remove cruft #4990
pdurbin Sep 18, 2018
0ae6dc1
document script and exit early if aws is not installed #4990
pdurbin Sep 19, 2018
6c9dec0
add script to list instances #4990
pdurbin Sep 19, 2018
5a91095
print multiple lines of output and put DNS last #4990
pdurbin Sep 19, 2018
42c7391
create key pair per instance #4990
pdurbin Sep 20, 2018
36df01f
open ports 80 and 443, print link #4990
pdurbin Sep 20, 2018
9dbf2be
put port 8080 in clickable link to avoid browser warnings #4990
pdurbin Sep 20, 2018
be995cc
replace sed command with --extra-vars arg #4990
pdurbin Sep 20, 2018
5c053a6
support non-IQSS repos #4990
pdurbin Sep 21, 2018
713a896
switch to non-nested extra vars #4990
pdurbin Sep 21, 2018
7317998
cleanup #4990
pdurbin Sep 21, 2018
a3f9078
Doc fix and remove unneeded install #4990
Sep 21, 2018
5ed3eb1
15 minutes and add back epel #4990
Sep 21, 2018
59c158a
Install epel-release before, add comment #4990
Sep 21, 2018
7021ae1
not the spin up script from Installation Guide #4990
pdurbin Sep 24, 2018
562c9c7
provide guidance on $PATH #4990
pdurbin Sep 28, 2018
746eec5
no ".txt" in aws config files, link to configure docs #4990
pdurbin Sep 28, 2018
1 change: 1 addition & 0 deletions .gitignore
@@ -43,3 +43,4 @@ conf/docker-aio/dv/install/dvinstall.zip
# or copy of test data
conf/docker-aio/testdata/
scripts/installer/default.config
*.pem
16 changes: 15 additions & 1 deletion doc/sphinx-guides/source/developers/coding-style.rst
@@ -93,6 +93,20 @@ If you just downloaded Netbeans and are using the out-of-the-box settings, you s

If you know of a way to easily share Netbeans configuration across a team, please get in touch.

Bash
----

Generally, Google's Shell Style Guide at https://google.github.io/styleguide/shell.xml seems to have good advice.

Formatting Code
~~~~~~~~~~~~~~~

Tabs vs. Spaces
^^^^^^^^^^^^^^^

Don't use tabs. Use 2 spaces.
Contributor

@poikilotherm Sep 21, 2018

This is not very consistent throughout the codebase.

Are you guys aware of Checkstyle Maven plugin for checking and maybe enforcing this kind of stuff throughout the dev team (and helping QA)?
Example config I used for Peerpub

(If you want, I can create a separate issue for talking about things like CheckStyle, PMD + CPD and FindBugs/SpotBugs)

Member

@poikilotherm I know! Help! Actually, can you please create an issue just about CheckStyle? To help us with consistent formatting like tabs vs. spaces? We try to work in small chunks and in the future you'd be welcome to create issues for FindBugs and other tools too. If you want, you can write the issue title in the form of a user story with something like "As a developer, I'd like some tooling to report on if the code I'm writing meets the coding style of the project such as tabs vs. spaces." Whatever wording makes sense to you. Then in the issue we can discuss CheckStyle vs. other tools. Again, the point is to have the smallest chunk possible. Maybe just the tabs vs. spaces thing. I hope this makes sense!

Member

@poikilotherm thanks for opening #5075 about code style.

Contributor

Cross-referencing: have a look at #5094 😄


shfmt from https://github.com/mvdan/sh seems like a decent way to enforce indentation of two spaces (i.e. ``shfmt -i 2 -w path/to/script.sh``) but be aware that it makes other changes.

Bike Shedding
-------------
@@ -103,4 +117,4 @@ Come debate with us about coding style in this Google doc that has public commen

----

-Previous: :doc:`debugging` | Next: :doc:`containers`
+Previous: :doc:`debugging` | Next: :doc:`deployment`
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/developers/containers.rst
@@ -410,4 +410,4 @@ Again, Dataverse Docker images on Docker Hub are highly experimental at this poi

----

-Previous: :doc:`coding-style` | Next: :doc:`making-releases`
+Previous: :doc:`deployment` | Next: :doc:`making-releases`
72 changes: 72 additions & 0 deletions doc/sphinx-guides/source/developers/deployment.rst
@@ -0,0 +1,72 @@
==========
Deployment
==========

Developers often only deploy Dataverse to their :doc:`dev-environment` but it can be useful to deploy Dataverse to cloud services such as Amazon Web Services (AWS).

.. contents:: |toctitle|
  :local:

Deploying Dataverse to Amazon Web Services (AWS)
------------------------------------------------

We have written scripts to deploy Dataverse to Amazon Web Services (AWS) but they require some setup.

Install AWS CLI
~~~~~~~~~~~~~~~

First, you need to have AWS Command Line Interface (AWS CLI) installed, which is called ``aws`` in your terminal. Launching your terminal and running the following command to print out the version of AWS CLI will tell you if it is installed or not:

``aws --version``
Contributor

It might be helpful to mention a known-working version of the AWS CLI (which may make things easier if a future update changes argument formatting used by the EC2 scripts).

Member

Mentioning a known good version isn't a bad idea but I believe the idea is that we'll be using this script fairly regularly so we should notice if it stops working.
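
A minimal sketch of the known-good-version idea discussed in this thread, not part of the PR. It assumes the output of `aws --version` starts with `aws-cli/X.Y.Z` with three numeric components. The function names and the `1.16.0` threshold are hypothetical placeholders, not a tested minimum.

```shell
# Hypothetical helpers, for illustration only.

# Extract "X.Y.Z" from a string such as "aws-cli/1.16.20 Python/2.7.15 ...".
parse_aws_cli_version() {
  local first_word="${1%% *}"
  echo "${first_word#aws-cli/}"
}

# Succeed if dotted version $1 is greater than or equal to dotted version $2.
# Assumes both versions have exactly three numeric components.
version_at_least() {
  local have_major="${1%%.*}" have_rest="${1#*.}"
  local want_major="${2%%.*}" want_rest="${2#*.}"
  local have_minor="${have_rest%%.*}" have_patch="${have_rest#*.}"
  local want_minor="${want_rest%%.*}" want_patch="${want_rest#*.}"
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
    return
  fi
  if [ "$have_minor" -ne "$want_minor" ]; then
    [ "$have_minor" -gt "$want_minor" ]
    return
  fi
  [ "$have_patch" -ge "$want_patch" ]
}

# Possible use in ec2-create-instance.sh (stderr is captured as well, because
# the version string may be printed there rather than on stdout):
#   installed=$(parse_aws_cli_version "$(aws --version 2>&1)")
#   version_at_least "$installed" "1.16.0" || echo "Untested AWS CLI version: $installed" 1>&2
```

A warning rather than a hard exit seems closer to the reply above, since the script is expected to be run regularly and breakage should be noticed anyway.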


If you have not yet installed AWS CLI you should install it by following the instructions at https://docs.aws.amazon.com/cli/latest/userguide/installing.html . Afterwards, you should re-run the "version" command above to verify that AWS CLI has been properly installed.

Configure AWS CLI
~~~~~~~~~~~~~~~~~

Next you need to configure AWS CLI.

Create a ``.aws`` directory in your home directory (which is called ``~``) like this:

``mkdir ~/.aws``

Create a plain text file at ``~/.aws/config`` with the following content::

  [default]
  region = us-east-1

Please note that at this time the region must be set to "us-east-1" but in the future we could improve our scripts to support other regions.

Create a plain text file at ``~/.aws/credentials`` with the following content::

  [default]
  aws_access_key_id = XXXXXXXXXXXXXXXXXXXX
  aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Then update the file and replace the values for "aws_access_key_id" and "aws_secret_access_key" with your actual credentials by following the instructions at https://aws.amazon.com/blogs/security/wheres-my-secret-access-key/

Download and Run the "Create Instance" Script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once you have done the configuration above, you are ready to try running the "create instance" script to spin up Dataverse in AWS.

Download :download:`ec2-create-instance.sh <../../../../scripts/installer/ec2-create-instance.sh>` and put it somewhere reasonable. For the purpose of these instructions we'll assume it's in the "Downloads" directory in your home directory.

You need to decide which branch you'd like to deploy to AWS. Select a branch from https://github.com/IQSS/dataverse/branches/all such as "develop" and pass it to the script with ``-b`` as in the following example. (Branches such as "master" and "develop" are described in the :doc:`version-control` section.)

``bash ~/Downloads/ec2-create-instance.sh -b develop``

You must specify the branch with ``-b`` but you can also specify a non-IQSS git repo URL with ``-r`` as in the following example.

``bash ~/Downloads/ec2-create-instance.sh -b develop -r https://github.com/scholarsportal/dataverse.git``

Now you will need to wait around 15 minutes until the deployment is finished. Eventually, the output should tell you how to access the installation of Dataverse in a web browser or via ssh. It will also provide instructions on how to delete the instance when you are finished with it. Please be aware that AWS charges per minute for a running instance. You can also delete your instance from https://console.aws.amazon.com/console/home?region=us-east-1 .

Caveats
~~~~~~~

Please note that while the script should work fine on newish branches, older branches that have different dependencies such as an older version of Solr are not expected to yield a working Dataverse installation. Your mileage may vary.

----

Previous: :doc:`coding-style` | Next: :doc:`containers`
1 change: 1 addition & 0 deletions doc/sphinx-guides/source/developers/index.rst
@@ -20,6 +20,7 @@ Developer Guide
  documentation
  debugging
  coding-style
  deployment
  containers
  making-releases
  tools
115 changes: 115 additions & 0 deletions scripts/installer/ec2-create-instance.sh
@@ -0,0 +1,115 @@
#!/bin/bash
Contributor

It would be slightly more portable to use #!/usr/bin/env bash instead of #!/bin/bash, although for how the documentation says to use these scripts that won't matter (and in practice, for CentOS and OS X it is in /bin/).

Member

I'm not sure where I picked up /bin/bash as the path but it always seems to work for me. I'm not aware of when I'd need to use env instead. Maybe if someone installed a newer version of bash? But I'm targeting the system bash.


# For docs, see the "Deployment" page in the Dev Guide.

SUGGESTED_REPO_URL='https://github.com/IQSS/dataverse.git'
SUGGESTED_BRANCH='develop'

usage() {
  echo "Usage: $0 -r $REPO_URL -b $SUGGESTED_BRANCH" 1>&2
  exit 1
}

REPO_URL=$SUGGESTED_REPO_URL

while getopts ":r:b:" o; do
  case "${o}" in
  r)
    REPO_URL=${OPTARG}
    ;;
  b)
    BRANCH_NAME=${OPTARG}
    ;;
  *)
    usage
    ;;
  esac
done

AWS_CLI_VERSION=$(aws --version)
if [[ "$?" -ne 0 ]]; then
  echo 'The "aws" program could not be executed. Is it in your $PATH?'
  exit 1
fi

if [ "$BRANCH_NAME" = "" ]; then
  echo "No branch name provided. You could try adding \"-b $SUGGESTED_BRANCH\" or other branches listed at $SUGGESTED_REPO_URL"
  usage
  exit 1
fi

if [[ $(git ls-remote --heads $REPO_URL $BRANCH_NAME | wc -l) -eq 0 ]]; then
  echo "Branch \"$BRANCH_NAME\" does not exist at $REPO_URL"
  usage
  exit 1
fi

SECURITY_GROUP='dataverse-sg'
GROUP_CHECK=$(aws ec2 describe-security-groups --group-name $SECURITY_GROUP)
if [[ "$?" -ne 0 ]]; then
  echo "Creating security group \"$SECURITY_GROUP\"."
  aws ec2 create-security-group --group-name $SECURITY_GROUP --description "security group for Dataverse"
  aws ec2 authorize-security-group-ingress --group-name $SECURITY_GROUP --protocol tcp --port 22 --cidr 0.0.0.0/0
  aws ec2 authorize-security-group-ingress --group-name $SECURITY_GROUP --protocol tcp --port 80 --cidr 0.0.0.0/0
  aws ec2 authorize-security-group-ingress --group-name $SECURITY_GROUP --protocol tcp --port 443 --cidr 0.0.0.0/0
  aws ec2 authorize-security-group-ingress --group-name $SECURITY_GROUP --protocol tcp --port 8080 --cidr 0.0.0.0/0
fi

RANDOM_STRING="$(uuidgen | cut -c-8)"
KEY_NAME="key-$USER-$RANDOM_STRING"
Contributor Author

I wonder if adding the branch name to this would provide value as well. Not a blocker for this version though

Member

We could, I guess. I was trying to keep the keyname somewhat short. My thought was that we could always check the running instance to see what branch is, if necessary.
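
One way the branch name could be folded into the key name while keeping it short is sketched below. This is an illustration only, not what the PR does: `make_key_name` is a hypothetical helper, and the 20-character cap and the set of allowed characters are arbitrary assumptions.

```shell
# Hypothetical helper, for illustration only.
# Builds "key-<user>-<branch>-<random>" with the branch sanitized and shortened.
make_key_name() {
  local user="$1" branch="$2" random_string="$3" safe_branch
  # Replace every character outside letters, digits, ".", "_" and "-" with "-",
  # then keep at most the first 20 characters.
  safe_branch=$(printf '%s' "$branch" | tr -c 'A-Za-z0-9._-' '-' | cut -c1-20)
  echo "key-$user-$safe_branch-$random_string"
}

# Possible use, mirroring the variables in ec2-create-instance.sh:
#   RANDOM_STRING="$(uuidgen | cut -c-8)"
#   KEY_NAME="$(make_key_name "$USER" "$BRANCH_NAME" "$RANDOM_STRING")"
```

Truncating the branch keeps the name readable, at the cost of two long branch names possibly sharing a prefix. The random suffix still keeps the key names distinct.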


PRIVATE_KEY=$(aws ec2 create-key-pair --key-name $KEY_NAME --query 'KeyMaterial' --output text)
if [[ $PRIVATE_KEY == '-----BEGIN RSA PRIVATE KEY-----'* ]]; then
  PEM_FILE="$KEY_NAME.pem"
  # Pass the key as an argument rather than as the printf format string.
  printf -- '%s\n' "$PRIVATE_KEY" >"$PEM_FILE"
  chmod 400 "$PEM_FILE"
  echo "Your newly created private key file is \"$PEM_FILE\". Keep it secret. Keep it safe."
else
  echo "Could not create key pair. Exiting."
  exit 1
fi

# The AMI ID may change in the future and the way to look it up is with the
# following command, which takes a long time to run:
#
# aws ec2 describe-images --owners 'aws-marketplace' --filters 'Name=product-code,Values=aw0evgkw8e5c1q413zgy5pjce' --query 'sort_by(Images, &CreationDate)[-1].[ImageId]' --output 'text'
#
# To use this AMI, we subscribed to it from the AWS GUI.
# AMI IDs are specific to the region.
AMI_ID='ami-9887c6e7'
# Instance sizes smaller than medium lead to Maven and Solr problems.
SIZE='t2.medium'
echo "Creating EC2 instance"
# TODO: Add some error checking for "ec2 run-instances".
INSTANCE_ID=$(aws ec2 run-instances --image-id $AMI_ID --security-groups $SECURITY_GROUP --count 1 --instance-type $SIZE --key-name $KEY_NAME --query 'Instances[0].InstanceId' --block-device-mappings '[ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true } } ]' | tr -d \")
echo "Instance ID: "$INSTANCE_ID
echo "End creating EC2 instance"

PUBLIC_DNS=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query "Reservations[*].Instances[*].[PublicDnsName]" --output text)
PUBLIC_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query "Reservations[*].Instances[*].[PublicIpAddress]" --output text)

USER_AT_HOST="centos@${PUBLIC_DNS}"
echo "New instance created with ID \"$INSTANCE_ID\". To ssh into it:"
echo "ssh -i $PEM_FILE $USER_AT_HOST"

echo "Please wait at least 15 minutes while the branch \"$BRANCH_NAME\" from $REPO_URL is being deployed."
Contributor Author

@matthew-a-dunlap Sep 21, 2018

We say 5-10 in the documentation (EDIT: UPDATED)

Member

Thanks for updating this. More than 15 minutes is more realistic.
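
Since the time estimate keeps shifting, one alternative to a fixed number is to poll until the deployed site answers. The sketch below is an illustration, not part of the PR. `retry_until_success` is a hypothetical helper, and the idea that Dataverse responds on port 8080 once it is ready is an assumption based on the link the script prints at the end.

```shell
# Hypothetical helper, for illustration only.
# Usage: retry_until_success <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or the attempts are used up.
retry_until_success() {
  local max_attempts="$1" delay_seconds="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay_seconds"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Possible use after the deployment step (60 attempts, 30 seconds apart):
#   retry_until_success 60 30 curl --silent --fail --output /dev/null "http://${PUBLIC_DNS}:8080"
```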


# epel-release is installed first to ensure the latest ansible is installed after
# TODO: Add some error checking for this ssh command.
ssh -T -i $PEM_FILE -o 'StrictHostKeyChecking no' -o 'UserKnownHostsFile=/dev/null' -o 'ConnectTimeout=300' $USER_AT_HOST <<EOF
Contributor

This feels like it might be a bad habit. Probably not worth investigating alternative ways to avoid the interactive prompts in this iteration, but might be something to keep in mind for the future.

Member

Yeah, I didn't question too much what's going on with this EOF HEREDOC. It works but I did leave a TODO saying that we should add some error checking at some point. I'm not sure of the best way to improve this and I'm definitely open to ideas.
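
For the error-checking TODO mentioned here, a minimal sketch follows. It is not part of the PR. `run_script_via` is a hypothetical helper, and `bash -s` stands in for the real `ssh` command so that the example can run locally. The underlying idea is that `ssh` normally exits with the status of the remote command (or 255 if the connection itself fails), so starting the heredoc with `set -e` lets a failing step surface as a non-zero exit status.

```shell
# Hypothetical helper, for illustration only.
# "$@" is the command that receives the script on stdin; in the real script
# this would be the ssh invocation, here "bash -s" is a local stand-in.
run_script_via() {
  "$@" <<'EOF'
set -e
echo "step 1"
false
echo "step 2 is never reached because of set -e"
EOF
}

if run_script_via bash -s; then
  echo "Remote script succeeded."
else
  echo "Remote script failed with exit status $?." 1>&2
fi
```

On the interactive-prompt concern, OpenSSH 7.6 and later also accept `StrictHostKeyChecking=accept-new`, which trusts a host key on first contact only. Whether that fits this script has not been tested here.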

sudo yum -y install epel-release
Contributor Author

I think I left epel-release in here from when I was trying to install older versions of ansible via pip. Pretty sure it can be removed.

sudo yum -y install git nano ansible
git clone https://github.com/IQSS/dataverse-ansible.git dataverse
export ANSIBLE_ROLES_PATH=.
ansible-playbook -i dataverse/inventory dataverse/dataverse.pb --connection=local --extra-vars "dataverse_branch=$BRANCH_NAME dataverse_repo=$REPO_URL"
EOF

# Port 8080 has been added because Ansible puts a redirect in place
# from HTTP to HTTPS and the cert is invalid (self-signed), forcing
# the user to click through browser warnings.
CLICKABLE_LINK="http://${PUBLIC_DNS}:8080"
echo "To ssh into the new instance:"
echo "ssh -i $PEM_FILE $USER_AT_HOST"
echo "Branch \"$BRANCH_NAME\" from $REPO_URL has been deployed to $CLICKABLE_LINK"
echo "When you are done, please terminate your instance with:"
echo "aws ec2 terminate-instances --instance-ids $INSTANCE_ID"
11 changes: 11 additions & 0 deletions scripts/installer/ec2-destroy-all.sh
@@ -0,0 +1,11 @@
#!/bin/bash

#This script gets all the instances from ec2 and sends terminate to them
Member

This "destroy" script is useful but a "list" or "read" script (I'm thinking CRUD) would be nice. "Give me a list of all the running instances and a command for each instance to destroy some or all of them"

#It's pretty basic and probably shouldn't be trusted at this point. Namely:
# - You can kill instances other people are using
# - It will try to kill instances that are already dead, which makes output hard to read
# - If it fails for some reason it's hard to tell the script didn't work right

INSTANCES=$(aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId]' --output text)

aws ec2 terminate-instances --instance-ids $INSTANCES
9 changes: 9 additions & 0 deletions scripts/installer/ec2-list-all.sh
@@ -0,0 +1,9 @@
#!/bin/bash
# https://docs.aws.amazon.com/cli/latest/userguide/controlling-output.html
INSTANCES=$(aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId,KeyName,State.Name,PublicDnsName]' --output text)
if [[ "$?" -ne 0 ]]; then
  echo "Error listing instances."
  exit 1
else
  echo "$INSTANCES"
fi
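
The review comment on ec2-destroy-all.sh also asked for a command per instance so that some or all instances can be destroyed. A sketch of that step, building on the list script above, could look like the following. It is not part of the PR. `print_terminate_commands` is a hypothetical helper, and it assumes the tab-separated column order produced by the `--query` expression in ec2-list-all.sh (InstanceId, KeyName, State.Name, PublicDnsName).

```shell
# Hypothetical helper, for illustration only.
# Reads "InstanceId<TAB>KeyName<TAB>State<TAB>PublicDnsName" lines on stdin and
# prints one terminate command for each instance that is not already terminated.
print_terminate_commands() {
  local tab instance_id key_name state public_dns
  tab=$(printf '\t')
  while IFS="$tab" read -r instance_id key_name state public_dns; do
    if [ -z "$instance_id" ] || [ "$state" = "terminated" ]; then
      continue
    fi
    echo "aws ec2 terminate-instances --instance-ids $instance_id # key: $key_name"
  done
}

# Possible use:
#   bash ec2-list-all.sh | print_terminate_commands
```

Printing the commands instead of running them leaves the decision with the person at the terminal, which addresses the caveat that ec2-destroy-all.sh can kill instances other people are using.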