Jobs

Jobs are required to create a synchronization deployment. Unlike processes, jobs are persistent resources and act as templates for processes. Each job triggers processes; a job defines when, how often, and which endpoints get synchronized. The processes triggered by a job are managed by the tubee task scheduler, which is based on TaskScheduler. Jobs define which endpoints trigger parallel processes and in what order; they also define intervals, retry levels, and timeouts.

Create job

A job is created from a resource spec like the following:

name: hourly
kind: Job
namespace: kzu
data:
  notification:
    enabled: false
    receiver: []
  collections:
  - ["accounts", "courses", "groups"]
  endpoints: 
  - mssql-import
  - mssql-relations
  - ["ldap-export", "balloon-export"]
  simulate: false
  log_level: debug
  ignore: true
  options:
    at: 0
    interval: 3600
    retry: 0
    retry_interval: 0
    timeout: 0

Create the job from the spec file:

tubectl create -f spec.yaml

Verify the newly created resource:

tubectl get jobs hourly -n kzu -o yaml

Parallelism

It is important to understand how jobs trigger processes and how those processes run in parallel to make the best use of the available resources (nodes and CPU cores).

Synchronous processes

To create a simple sequential order in which the endpoint named a is processed first and, as soon as it finishes, triggers a second process for the endpoint b (both for the collection named accounts), list the endpoints one after another:

data:
  collections: 
  - accounts
  endpoints:
  - a
  - b

This job configuration will trigger a total of three processes:

  1. The main process
  2. The sync process for the endpoint a
  3. The sync process for the endpoint b

Note: The main process is always finished as soon as all child processes have been executed.

Parallel processes

To create parallel processes, one may specify a list of endpoints and/or collections:

data:
  collections: 
  - ['accounts', 'groups']
  endpoints:
  - a
  - b

This will trigger a total of five processes:

  1. The main process
  2. One process for accounts.a and one for groups.a at the same time
  3. One process for accounts.b and one for groups.b at the same time, once the processes for endpoint a have finished

Simulation

A job can be entirely simulated by setting data.simulate to true. The default is false. While simulation is enabled, everything is executed as usual but actions are only simulated; no changes are made, neither on tubee nor on any endpoint.
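
For example, the data section of a spec that only simulates its actions might contain (other fields omitted):

data:
  simulate: true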

Logging

By default, jobs are executed with the log level error. The log level may be changed to one of:

  • emergency
  • critical
  • error
  • warning
  • notice
  • info
  • debug

Be very careful with low levels like debug. Low log levels have a massive impact on performance and should only be used during initial testing and configuration.
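
For example, to run a job with a moderately verbose level (warning is just an illustrative choice), set:

data:
  log_level: warning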

Continue on error

Normally a process terminates as soon as it encounters an exception. By setting data.ignore to true, the processor ignores such errors and continues with the next object. The default is false, but it is usually safe and a good idea to set it to true.
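
For example, a spec that continues on object-level errors sets:

data:
  ignore: true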

Notification

A job may trigger a mail notification as soon as it has been executed. Notifications are disabled by default but may be enabled by setting data.notification.enabled to true. A notification may be sent to multiple receivers, but at least one needs to be specified:

data:
  notification: 
    enabled: true
    receiver:
    - admin@example.org

Job timing

By default a job triggers only once and never again. Usually this is not what is wanted. One may specify an interval to let a job retrigger. It is also possible to set a specific time at which the job should trigger for the first time.

The following setup leads to an immediate trigger as soon as the job is created and retriggers every hour:

data:
  options:
    at: 0
    interval: 3600

The option data.options.at defaults to 0, which means immediately, but it may be set to a unix timestamp. The job will then trigger a process at the given time.
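
For example, to delay the first trigger until a specific point in time and then retrigger hourly, at may be set to a unix timestamp (the value below is only illustrative):

data:
  options:
    at: 1735689600
    interval: 3600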

Retry & Errors

If a job (or one of its processes) fails, it may trigger a retry process. By default this mechanism is disabled, but it may be enabled by specifying a retry count in data.options.retry. A value of 2 means the process gets triggered up to two more times if it fails. If a retry is configured, it is best practice to also define a retry interval; otherwise the time between failures might be too short for the underlying issue to have been resolved in the meantime.

This example will retry up to two times with an interval of 30 minutes (three runs including the first try):

data:
  options:
    retry: 2
    retry_interval: 1800

Even if data.ignore is set to true, there are still some circumstances in which a process might fail, for example if an endpoint cannot be initialized due to network errors.

Timeouts

It is possible to configure a timeout via data.options.timeout, which defaults to 0 (no timeout). Be careful with timeouts, as an aborted process may leave endpoints in an incomplete state.
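
For example, a spec enforcing a one hour timeout (the value is only illustrative) might look like:

data:
  options:
    timeout: 3600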