Jobslib
Introduction
Jobslib is a library for launching Python tasks in a parallel environment. Our use case is as follows: we have two datacenters (three in the near future), and in each datacenter a server runs some task. However, only one task may be active at any one time across all datacenters. Jobslib solves this problem.
Main features are:
Ancestor for the class which holds configuration.
Ancestor for the container for shared resources, e.g. a database connection.
Ancestor for the class with the task.
Configurable either from a configuration file or from environment variables.
Liveness – a mechanism for exporting information about the health state of the task. Jobslib includes an implementation which uses Consul.
Metrics – a mechanism for exporting metrics. Jobslib includes an implementation which uses InfluxDB.
One Instance Lock – a lock which allows only one running instance at a time. Jobslib includes an implementation which uses Consul.
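To make the One Instance Lock idea concrete, here is a minimal, process-local sketch of the semantics: one owner at a time, with an expiry so that a crashed owner does not block everybody forever. The class name `InMemoryLock`, its methods, and the expiry (`ttl`) are assumptions made for illustration only. This is not the Jobslib API, and it is not a distributed lock; the shipped implementation relies on Consul.

```python
import time


class InMemoryLock:
    """Toy one-instance lock (hypothetical): one owner, expires after ttl."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so the behaviour can be tested
        self._owner = None
        self._expires_at = 0.0

    def acquire(self, owner):
        """Return True if owner now holds the lock, False otherwise."""
        now = self._clock()
        if self._owner not in (None, owner) and now < self._expires_at:
            return False  # somebody else holds a lock that is still valid
        self._owner = owner
        self._expires_at = now + self._ttl  # acquiring again refreshes it
        return True

    def release(self, owner):
        """Release the lock; only the current owner may do so."""
        if self._owner == owner:
            self._owner = None
            return True
        return False
```

With one such lock shared by all candidates, only the candidate whose `acquire()` returns `True` would run the task; the others skip the run and try again later.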
Installation
Installation from source code:
$ git clone https://github.com/seznam/jobslib.git
$ cd jobslib
$ python setup.py install
Installation from PyPI:
$ pip install jobslib
Tox is used for testing:
$ git clone https://github.com/seznam/jobslib.git
$ cd jobslib
$ pip install tox
$ tox --skip-missing-interpreters
Usage
A task is launched from the command line using the runjob command:
$ runjob [-s SETTINGS] [--disable-one-instance] [--run-once]
         [--sleep-interval SLEEP_INTERVAL] [--run-interval RUN_INTERVAL]
         [--keep-lock] [--release-on-error]
         task_cls
$ # Pass settings module using -s argument
$ runjob -s myapp.settings myapp.task.HelloWorld --run-once
$ # Pass settings module using environment variable
$ export JOBSLIB_SETTINGS_MODULE="myapp.settings"
$ runjob myapp.task.HelloWorld --run-once
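The two ways of passing the settings module shown above can be sketched as a small lookup function. This is only an illustration: the function name `resolve_settings_module` is hypothetical, and the precedence (a value given with -s wins over the environment variable) is an assumption, not something this page states.

```python
import os


def resolve_settings_module(cli_value=None, environ=None):
    """Return the dotted name of the settings module.

    Assumption: a value passed with -s takes precedence over the
    JOBSLIB_SETTINGS_MODULE environment variable.
    """
    environ = os.environ if environ is None else environ
    name = cli_value or environ.get("JOBSLIB_SETTINGS_MODULE")
    if not name:
        raise LookupError(
            "no settings module: pass -s or set JOBSLIB_SETTINGS_MODULE")
    return name
```

For example, `resolve_settings_module("myapp.settings", {})` and `resolve_settings_module(None, {"JOBSLIB_SETTINGS_MODULE": "myapp.settings"})` both return `"myapp.settings"`.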
A task normally runs in an infinite loop. The delay in seconds between individual launches is controlled by either the --sleep-interval or the --run-interval argument. --sleep-interval is the number of seconds to sleep after the task is done. --run-interval means that the task is run every RUN_INTERVAL seconds. The two arguments cannot be used together. The --keep-lock argument causes the lock to be kept during sleeping; this is useful when you have several machines and you want to keep the task on the same machine. If you use --keep-lock, you can force the lock to be released on error with --release-on-error.
If you don’t want to launch the task forever, use the --run-once argument. The library provides a locking mechanism so that a task can be launched on several machines while only one instance runs at a time. If you don’t want this locking, use the --disable-one-instance argument. All these options can also be set in the settings module. The optional -s/--settings argument defines the Python module where the configuration is stored. Alternatively, you can pass the settings module using the JOBSLIB_SETTINGS_MODULE environment variable.
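The difference between --sleep-interval and --run-interval is easiest to see as arithmetic on the waiting time. The sketch below is one reading of the description above, not Jobslib's actual scheduling code: the function name `delay_before_next_run` is hypothetical, and the behaviour when a run takes longer than the run interval (start again immediately) is an assumption.

```python
def delay_before_next_run(task_duration, sleep_interval=None,
                          run_interval=None):
    """Seconds to wait after a run that took task_duration seconds."""
    # The two options are mutually exclusive, and one of them is required.
    if (sleep_interval is None) == (run_interval is None):
        raise ValueError("give exactly one of sleep_interval, run_interval")
    if sleep_interval is not None:
        # Fixed pause, measured from the end of the task.
        return float(sleep_interval)
    # Fixed period, measured from start to start. Assumption: if the task
    # overran the period, the next run starts without any delay.
    return max(0.0, float(run_interval) - task_duration)
```

Under these assumptions, a task that takes 12 seconds waits 60 seconds with a sleep interval of 60 (one launch every 72 seconds), but only 48 seconds with a run interval of 60 (one launch every 60 seconds).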
During task initialization, instances of the jobslib.Config and jobslib.Context classes are created. You can define your own classes in the settings module. jobslib.Config is a container which holds configuration. jobslib.Context is a container which holds resources that your task needs, for example a database connection. Finally, when both classes are successfully initialized, an instance of the task (a subclass of jobslib.BaseTask passed as the task_cls argument) is created and launched.
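The task_cls argument is a dotted path such as myapp.task.HelloWorld. Turning such a path into a class is typically done with the standard importlib module. The following is a minimal sketch of that technique; the helper name `import_object` is hypothetical, and Jobslib's own loader may differ in its details and error handling.

```python
import importlib


def import_object(dotted_path):
    """Import 'package.module.Name' and return the object called Name."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ImportError(f"{dotted_path!r} is not a dotted path")
    # Raises ModuleNotFoundError (an ImportError) for an unknown module.
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise ImportError(
            f"module {module_name!r} has no attribute {attr_name!r}"
        ) from None
```

For instance, `import_object("collections.OrderedDict")` returns the `OrderedDict` class from the standard library; a path to your own task class would be resolved the same way, provided its package is importable.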
If you want to write your own task, inherit from the jobslib.BaseTask class and override the jobslib.BaseTask.task() method:
import sys

from jobslib import BaseTask


class HelloWorld(BaseTask):

    name = 'helloworld'
    description = 'prints hello world'

    def task(self):
        sys.stdout.write('Hello World!\n')
        sys.stdout.flush()
Configure your task in the settings module:
ONE_INSTANCE = {
    'backend': 'jobslib.oneinstance.dummy.DummyLock',
}

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'default': {
            'format': '%(asctime)s %(name)s %(levelname)s %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'level': 'NOTSET',
            'formatter': 'default',
        },
    },
    'root': {
        'handlers': ['console'],
        'level': 'INFO',
    },
}
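The LOGGING dictionary above follows the schema of Python's standard logging.config.dictConfig (note the 'version': 1 key). Assuming Jobslib passes it to dictConfig, which this page does not state explicitly, you can try the same configuration outside Jobslib with the standard library alone:

```python
import logging
import logging.config

# Same dictionary as in the settings module above.
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'default': {
            'format': '%(asctime)s %(name)s %(levelname)s %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'level': 'NOTSET',
            'formatter': 'default',
        },
    },
    'root': {
        'handlers': ['console'],
        'level': 'INFO',
    },
}

# Apply the configuration to the root logger.
logging.config.dictConfig(LOGGING)

# The logger name is only an example; the record goes to stderr because
# StreamHandler writes there by default.
logging.getLogger('helloworld.task.HelloWorld').info('Run task')
```

The handler level NOTSET lets every record through, so the root logger level INFO is what actually filters out DEBUG messages.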
Optionally, you can override jobslib.Config and/or jobslib.Context. Finally, run your task:
$ runjob -s helloworld.settings --run-once helloworld.task.HelloWorld
2020-07-03 14:53:25 helloworld.task.HelloWorld INFO Run task
Hello World!
2020-07-03 14:53:25 helloworld.task.HelloWorld INFO Task done
Reference manual
- Settings – basic configuration of your application
- Config – container for configuration
- Context – container for shared resources
- Task – class which encapsulates the task
- Liveness – information about the health state of the task
- Metrics – task metrics
- One Instance Lock – only one running instance at the same time
Source code and license
The source code is available on GitHub at https://github.com/seznam/jobslib under the 3-clause BSD license. The project follows Semantic Versioning and uses the Keep a Changelog format for its changelog.