    [SPARK-8425][CORE] Application Level Blacklisting · 93cdb8a7
    Imran Rashid authored
    ## What changes were proposed in this pull request?
    
    This builds upon the blacklisting introduced in SPARK-17675 to add blacklisting of executors and nodes for an entire Spark application. Resources are blacklisted based on task failures in tasksets that eventually complete successfully; they are automatically returned to the pool of active resources after a timeout. Full details are available in a design doc attached to the JIRA.
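
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not Spark's `BlacklistTracker`: the class name, method names, thresholds, and timeout value are all hypothetical, chosen only to show failures from successful tasksets accumulating per executor, escalation from executors to their node, and timeout-based expiry.

```python
from collections import defaultdict


class AppBlacklistSketch:
    """Hypothetical, simplified model of application-level blacklisting."""

    def __init__(self, max_failures_per_executor=2,
                 max_bad_executors_per_node=2, timeout_ms=60_000):
        # All three defaults are illustrative assumptions, not Spark's values.
        self.max_failures_per_executor = max_failures_per_executor
        self.max_bad_executors_per_node = max_bad_executors_per_node
        self.timeout_ms = timeout_ms
        self.failure_counts = defaultdict(int)  # executor id -> failed tasks seen
        self.executor_blacklist = {}            # executor id -> (node, expiry ms)
        self.node_blacklist = {}                # node -> expiry ms

    def on_taskset_success(self, failures, now_ms):
        """Record task failures from a taskset that completed successfully.

        `failures` maps executor id -> (node, number of failed tasks).
        """
        for exec_id, (node, count) in failures.items():
            if exec_id in self.executor_blacklist:
                continue
            self.failure_counts[exec_id] += count
            if self.failure_counts[exec_id] < self.max_failures_per_executor:
                continue
            # Threshold reached: blacklist the executor until the timeout.
            expiry = now_ms + self.timeout_ms
            self.executor_blacklist[exec_id] = (node, expiry)
            del self.failure_counts[exec_id]
            # Escalate to the node once enough of its executors are blacklisted.
            bad_on_node = sum(
                1 for n, _ in self.executor_blacklist.values() if n == node
            )
            if bad_on_node >= self.max_bad_executors_per_node:
                self.node_blacklist[node] = expiry

    def expire(self, now_ms):
        """Return resources whose timeout has elapsed to the active pool."""
        self.executor_blacklist = {
            e: (n, t) for e, (n, t) in self.executor_blacklist.items() if t > now_ms
        }
        self.node_blacklist = {
            n: t for n, t in self.node_blacklist.items() if t > now_ms
        }

    def is_executor_blacklisted(self, exec_id):
        return exec_id in self.executor_blacklist

    def is_node_blacklisted(self, node):
        return node in self.node_blacklist
```

In this sketch a scheduler would call `expire` with the current time before offering resources, then skip any executor or node for which the `is_*_blacklisted` checks return true.
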
    ## How was this patch tested?
    
    Added unit tests and ran them via Jenkins; also ran a handful of them in a loop to check for flakiness.
    
    The added tests include:
    - verifying BlacklistTracker works correctly
    - verifying TaskSchedulerImpl interacts with BlacklistTracker correctly (via a mock BlacklistTracker)
    - an integration test for the entire scheduler with blacklisting in a few different scenarios
    
    Author: Imran Rashid <irashid@cloudera.com>
    Author: mwws <wei.mao@intel.com>
    
    Closes #14079 from squito/blacklist-SPARK-8425.