MHRD Projects Gitlab

  1. Aug 25, 2017
  2. Aug 23, 2017
    • Merge pull request #53 from edx/tobz/instnce-fleet-ansible-param · 4c65838a
      Toby Lawrence authored
      Change Ansible to merge in extra vars rather than replace.
      4c65838a
    • Change Ansible to merge in extra vars rather than replace. · fae7ff32
      Toby Lawrence authored
      Since we define our var overrides via extra vars with a dictionary,
      Ansible's default behaviour appears to be to replace the whole
      dictionary rather than merge it.  This causes an issue when you try to
      use only one of groups or fleets.  Because of the logic we use to decide
      whether the user is requesting an instance group-backed or an instance
      fleet-backed cluster, we could launch with a default group-based setup
      (ignoring the fleet config) or with an erroneous string for the fleet
      config.
      Basically, it's broken either way since we can't run 99% of our jobs on
      an m1.medium.  From a technical standpoint, though, the cluster could
      successfully launch and schedule Hadoop jobs.  They would just
      inevitably fail from OOM at some point.
      
      This change properly brings instance_fleets back into the fold by
      passing it in, but changes the hash behaviour so that
      instance_groups/instance_fleets are always both defined, as they should
      be, with the correct defaults to ...
      fae7ff32
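
The difference between replacing and merging dictionaries described in this commit can be illustrated with a minimal Python sketch. It is not the project's code and not Ansible's implementation; it only mimics the two semantics that Ansible's `hash_behaviour` setting (`replace` by default, or `merge`) selects between. The `cluster` key, the instance types, and the function names are hypothetical.

```python
import copy


def replace_vars(defaults, extra):
    """Top-level keys in `extra` wholesale replace those in `defaults`."""
    result = copy.deepcopy(defaults)
    result.update(copy.deepcopy(extra))
    return result


def merge_vars(defaults, extra):
    """Recursively merge `extra` into `defaults`, keeping unrelated keys."""
    result = copy.deepcopy(defaults)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_vars(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical defaults: both keys are always defined.
defaults = {
    "cluster": {
        "instance_groups": {"core": {"type": "m4.large"}},
        "instance_fleets": {},
    }
}

# Hypothetical override that only mentions fleets.
extra = {"cluster": {"instance_fleets": {"core": {"types": ["m4.xlarge"]}}}}

replaced = replace_vars(defaults, extra)
merged = merge_vars(defaults, extra)

# With replacement, the instance_groups default disappears entirely.
print(sorted(replaced["cluster"]))
# With merging, both keys survive and the fleet override is applied.
print(sorted(merged["cluster"]))
```

With replacement, any logic that inspects `instance_groups` finds it missing once an override touches the same dictionary; with merging, both keys stay defined and only the overridden values change.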
  3. Aug 22, 2017
  4. Aug 21, 2017
  5. Aug 18, 2017
  6. Aug 11, 2017
  7. Jul 27, 2017
  8. Jun 29, 2017
  9. Feb 28, 2017
  10. Feb 17, 2017
  11. Feb 16, 2017
    • Read the workflow logs before collecting metrics. · bb653098
      Toby Lawrence authored
      Some of the workflow jobs generate a lot of log output, and when we're
      trying to see whether metrics were collected, that information is
      buried somewhere in the middle of the "console output" for the job.
      
      We're just moving the metrics collection to the very end so that it's
      easier to find no matter how large the workflow's log output is.
      bb653098
  12. Feb 11, 2017
    • Merge pull request #45 from edx/tobz/better-metrics · 858646fd
      Toby Lawrence authored
      Add some debugging to metrics collection, which is still tumultuous.
      0.2.12
      858646fd
    • Add some debugging to metrics collection, which is still tumultuous. · 1487f583
      Toby Lawrence authored
      As we're still seeing some weird SSH issues when connecting to the
      master node, we're now using -vvvv so we can get the full SSH output as
      we try to debug these problems.
      
      We're also invoking the playbook by appending "|| true", so that these
      SSH failures don't cause Jenkins to think the job failed.  This is
      important because some tasks downstream will not run if their upstream
      requirement "failed" because of an SSH error when collecting the
      metrics.
      
      We're also spitting out the stdout from the collection script to console
      so that we have a better idea of what the script is doing, just in case
      that's part of the "SSH error."
      1487f583
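
The effect of appending "|| true" while still printing the command's output can be sketched in Python. This is an illustration under stated assumptions, not the project's Jenkins job: `run_tolerant` is a hypothetical helper, and the failing command is simulated with a small Python child process rather than a real `ansible-playbook -vvvv` run.

```python
import subprocess
import sys


def run_tolerant(cmd):
    """Run cmd, echo its combined output, and always report success.

    Returns (reported_status, actual_status); reported_status is always 0,
    which is what `cmd || true` gives the calling shell.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout, as a console log would show
        text=True,
    )
    print(proc.stdout, end="")
    if proc.returncode != 0:
        print(f"ignoring exit status {proc.returncode} from: {cmd[0]}")
    return 0, proc.returncode


# Simulated failing step (stands in for a metrics collection run that hits an SSH error).
failing = [
    sys.executable,
    "-c",
    "import sys; print('ssh: connection failed'); sys.exit(255)",
]

reported, actual = run_tolerant(failing)
```

The trade-off is the one the commit message describes: downstream tasks keep running because the step reports success, so the real exit status has to be read from the console output instead.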
  13. Feb 08, 2017
  14. Feb 07, 2017
  15. Feb 04, 2017
  16. Aug 30, 2016
  17. Jun 07, 2016
  18. May 07, 2016
  19. Apr 28, 2016
  20. Apr 21, 2016
  21. Apr 19, 2016
  22. Apr 15, 2016
  23. Apr 14, 2016
  24. Apr 13, 2016
  25. Apr 08, 2016
  26. Mar 22, 2016