- Aug 25, 2017
-
-
Toby Lawrence authored
Allow instance group/instance fleet configuration to peacefully coexist.
-
Toby Lawrence authored
Because of how we use Ansible, the first iteration of instance fleet support was hampered by Ansible's default behaviours and by some of the esoteric syntax we were using. We slightly changed our logic for detecting fleet vs. group, which lets us use a cruder but effective default value for this work.
-
- Aug 23, 2017
-
-
Toby Lawrence authored
Change Ansible to merge in extra vars rather than replace.
-
Toby Lawrence authored
Since we define our var overrides via extra vars with a dictionary, Ansible's default behaviour appears to be to replace the dictionaries wholesale. This causes an issue when you try to use only one of groups or fleets. Because of the logic we use to figure out whether the user is requesting an instance group-backed or an instance fleet-backed cluster, we would potentially launch with a default group-based setup (ignoring the fleet config) or with an erroneous string for the fleet config. Basically, it's broken either way, since we can't run 99% of our jobs on an m1.medium. From a technical standpoint, though, the cluster could successfully launch and schedule Hadoop jobs; they would just inevitably fail with OOM errors at some point. This change properly brings instance_fleets back into the fold by passing it in, but changes the hash behaviour so that instance_groups/instance_fleets are always both defined, as they should be, with the correct defaults to ...
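The difference between replacing and merging dictionaries can be sketched in plain Python. This is only an illustration of the general idea behind Ansible's `hash_behaviour` setting (replace by default, merge optionally), not the code from this change; the variable layout and the names `replace_vars` and `merge_vars` are hypothetical.

```python
import copy


def replace_vars(defaults, extra_vars):
    """Mimic 'replace': a dict supplied as an override swaps out the whole default dict."""
    result = dict(defaults)
    result.update(extra_vars)
    return result


def merge_vars(defaults, extra_vars):
    """Mimic 'merge': nested dicts are combined key by key instead of swapped."""
    result = copy.deepcopy(defaults)
    for key, value in extra_vars.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_vars(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical variable layout, for illustration only.
defaults = {
    "emr": {
        "instance_groups": {"master": {"type": "m1.medium"}},
        "instance_fleets": {},
    }
}
overrides = {"emr": {"instance_fleets": {"core": {"types": ["m4.xlarge"]}}}}

replaced = replace_vars(defaults, overrides)
merged = merge_vars(defaults, overrides)

# With replace, the instance_groups key is lost entirely.
print(sorted(replaced["emr"].keys()))
# With merge, both keys are always defined.
print(sorted(merged["emr"].keys()))
```

With the merge variant, both `instance_groups` and `instance_fleets` remain defined after the override, so downstream logic can test their contents rather than their presence.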
-
- Aug 22, 2017
-
-
Toby Lawrence authored
Adjust the default value for instance_fleets in the emr library.
-
Toby Lawrence authored
-
Toby Lawrence authored
-
- Aug 21, 2017
-
-
Hassan authored
Master instance fleet config.
-
- Aug 18, 2017
-
-
Hassan Javeed authored
-
- Aug 11, 2017
-
-
Toby Lawrence authored
[DE-276] Support for launching EMR clusters based on instance fleets.
-
Toby Lawrence authored
Launching EMR clusters based on instance fleet specifications allows us to specify a set of subnets, and a set of instance types, and opportunistically launch clusters in the best priced subnet with the best mix of instance types to meet the capacity demands.
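As a rough sketch of what such a specification looks like, the helpers below assemble the `Instances` portion of an EMR RunJobFlow request using the field names from the EMR API (`Ec2SubnetIds`, `InstanceFleets`, `InstanceTypeConfigs`). The helper names, subnet IDs, instance types, and capacities are hypothetical values chosen for illustration and are not taken from this repository.

```python
def build_fleet(name, fleet_type, instance_types, on_demand=0, spot=0):
    """Build one instance fleet entry.

    instance_types is a list of (instance_type, weighted_capacity) pairs.
    """
    return {
        "Name": name,
        "InstanceFleetType": fleet_type,  # "MASTER", "CORE", or "TASK"
        "TargetOnDemandCapacity": on_demand,
        "TargetSpotCapacity": spot,
        "InstanceTypeConfigs": [
            {"InstanceType": itype, "WeightedCapacity": weight}
            for itype, weight in instance_types
        ],
    }


def build_instances(subnet_ids, fleets):
    """Combine candidate subnets and fleets into an Instances structure."""
    return {"Ec2SubnetIds": list(subnet_ids), "InstanceFleets": list(fleets)}


# Hypothetical subnets and instance types.
instances = build_instances(
    ["subnet-aaaa1111", "subnet-bbbb2222"],
    [
        build_fleet("master", "MASTER", [("m4.xlarge", 1)], on_demand=1),
        build_fleet("core", "CORE", [("m4.xlarge", 1), ("m4.2xlarge", 2)], spot=4),
    ],
)
```

Listing several subnets and several instance types per fleet is what gives EMR the room to pick a placement and instance mix that meets the target capacity.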
-
- Jul 27, 2017
-
-
Toby Lawrence authored
Adds EbsConfiguration stanza to InstanceGroup specification
-
- Jun 29, 2017
-
-
Jillian Vogel authored
Triggered by specifying volume_type and/or volume_size:
* volume_type: default is gp2
* volume_size: default is 32 (GB)
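A minimal sketch of how the volume_type and volume_size options might map onto an EMR `EbsConfiguration` stanza follows. The field names come from the EMR API; the function name `build_ebs_configuration` and the choice of one volume per instance are assumptions made for illustration, not details confirmed by the commit.

```python
def build_ebs_configuration(volume_type=None, volume_size=None):
    """Return an EbsConfiguration dict, or None when neither option is given."""
    if volume_type is None and volume_size is None:
        # Nothing specified: do not emit the stanza at all.
        return None
    return {
        "EbsBlockDeviceConfigs": [
            {
                "VolumeSpecification": {
                    "VolumeType": volume_type if volume_type is not None else "gp2",
                    "SizeInGB": int(volume_size) if volume_size is not None else 32,
                },
                # Assumption: a single volume per instance.
                "VolumesPerInstance": 1,
            }
        ]
    }


only_size = build_ebs_configuration(volume_size=100)
only_type = build_ebs_configuration(volume_type="io1")
neither = build_ebs_configuration()
```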
-
- Feb 28, 2017
-
-
Toby Lawrence authored
Shorten the control path.
-
Toby Lawrence authored
-
- Feb 17, 2017
-
-
Toby Lawrence authored
Read the workflow logs before collecting metrics.
-
- Feb 16, 2017
-
-
Toby Lawrence authored
Some of the workflow jobs generate a lot of log output, and when we're trying to see whether metrics were collected, that information is buried somewhere in the middle of the "console output" for the job. We're moving the metrics collection to the very end so that it's easier to find no matter how large the workflow's log output is.
-
- Feb 11, 2017
-
-
Toby Lawrence authored
Add some debugging to metrics collection, which is still tumultuous.
-
Toby Lawrence authored
As we're still seeing some weird SSH issues when connecting to the master node, we're now using -vvvv so we can get the full SSH output as we try to debug these problems. We're also invoking the playbook by appending "|| true", so that these SSH failures don't cause Jenkins to think the job failed. This is important because some tasks downstream will not run if their upstream requirement "failed" because of an SSH error when collecting the metrics. We're also spitting out the stdout from the collection script to console so that we have a better idea of what the script is doing, just in case that's part of the "SSH error."
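The effect of appending "|| true" can be mirrored in Python for illustration: run the command, echo its output, and report success regardless of the exit status. This is a sketch under assumptions; the function name `run_ignoring_failure` is hypothetical, and the failing command is a stand-in for the real playbook invocation.

```python
import subprocess
import sys


def run_ignoring_failure(command):
    """Run a command, echo its combined output, and always return 0."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # Surface the output so failures can still be diagnosed from the console.
    print(completed.stdout, end="")
    if completed.returncode != 0:
        print("ignoring exit status {}".format(completed.returncode))
    return 0


# Stand-in for the metrics collection step: prints a line, then fails
# with the exit status SSH typically uses for connection errors.
failing_command = [
    sys.executable,
    "-c",
    "import sys; print('collecting metrics'); sys.exit(255)",
]
status = run_ignoring_failure(failing_command)
```

The trade-off is the same as with "|| true": downstream tasks keep running, but a genuine collection failure is only visible in the console output.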
-
- Feb 08, 2017
-
-
Toby Lawrence authored
Set the right SSH username to use when collecting metrics.
-
- Feb 07, 2017
-
-
Toby Lawrence authored
-
Toby Lawrence authored
[AN-8425] Collect Hadoop metrics after workflow completes
-
Toby Lawrence authored
-
Toby Lawrence authored
We've added a new Ansible playbook (playfile, really) that will execute the collection script for us, providing all the necessary bits to SSH to the proper node. We've also added a new Make target, which invokes said Ansible playbook, and we now run it after a workflow completes (after remote-task returns), before the cluster is terminated.
-
- Feb 04, 2017
-
-
Toby Lawrence authored
[AN-8425] Collect Hadoop metrics after workflow completes
-
Toby Lawrence authored
We've added a new Ansible playbook (playfile, really) that will execute the collection script for us, providing all the necessary bits to SSH to the proper node. We've also added a new Make target, which invokes said Ansible playbook, and we now run it after a workflow completes (after remote-task returns), before the cluster is terminated.
-
- Aug 30, 2016
-
-
Calen Pennington authored
* Add an OEP-2 compliant openedx.yaml file
* Specify an owner
-
- Jun 07, 2016
-
-
Jillian Vogel authored
-
- May 07, 2016
-
-
Gabe Mulley authored
-
- Apr 28, 2016
-
-
brianhw authored
Pass --virtualenv-extra-args to remote-task.
-
- Apr 21, 2016
-
-
Gabe Mulley authored
* support additional master security groups
* run make bootstrap instead of make install
* allow region override via environment variable
-
- Apr 19, 2016
-
-
Gabe Mulley authored
-
- Apr 15, 2016
-
-
Matjaz Gregoric authored
-
- Apr 14, 2016
-
-
Brian Wilson authored
- Apr 13, 2016
-
-
Omar Khan authored
When using EMR release 4.4.0, use hadoop, ganglia, hive, pig, and sqoop from the EMR distribution.
- Apr 08, 2016
-
-
Jillian Vogel authored
Required to allow edx-analytics-configuration to use ansible 1.4.4 on boxes provisioned for use with jenkins analytics using ansible 1.9.3.
-
- Mar 22, 2016
-
-
Gabe Mulley authored
Revert "Upgrade to ansible version 1.9.3-rc1-edx."
-
Gabe Mulley authored
-