[源码分析] 分布式任务队列 Celery 之 发送Task & AMQP
[源码分析] 分布式任务队列 Celery 之 发送Task & AMQP
0x00 摘要
Celery是一个简单、灵活且可靠的,处理大量消息的分布式系统,专注于实时处理的异步任务队列,同时也支持任务调度。
在之前的文章中,我们看到了关于Task的分析,本文我们重点看看在客户端如何发送Task,以及 Celery 的amqp对象如何使用。
在阅读之前,我们依然要提出几个问题,以此作为阅读时候的指引:
- 客户端启动时候,Celery 应用 和 用户自定义 Task 是如何生成的?
- Task 装饰器起到了什么作用?
- 发送 Task 时候,消息是如何组装的?
- 发送 Task 时候,采用什么媒介(模块)来发送?amqp?
- Task 发送出去之后,在 Redis 之中如何存储?
说明:在整理文章时,发现漏发了一篇,从而会影响大家阅读思路,特此补上,请大家谅解。
[源码解析] 并行分布式框架 Celery 之 worker 启动 (1)
[源码解析] 并行分布式框架 Celery 之 worker 启动 (2)
[源码解析] 分布式任务队列 Celery 之启动 Consumer
[源码解析] 并行分布式任务队列 Celery 之 Task是什么
[从源码学设计]celery 之 发送Task & AMQP 就是本文,从客户端角度讲解发送Task
[源码解析] 并行分布式任务队列 Celery 之 消费动态流程 下一篇文章从服务端角度讲解收到 Task 如何消费
[源码解析] 并行分布式任务队列 Celery 之 多进程模型
0x01 示例代码
我们首先给出示例代码。
1.1 服务端
示例代码服务端如下,这里使用了装饰器来包装待执行任务。
from celery import Celery
app = Celery('myTest', broker='redis://localhost:6379')
@app.task
def add(x,y):
return x+y
if __name__ == '__main__':
app.worker_main(argv=['worker'])
1.2 客户端
客户端发送代码如下,就是调用 add Task 来做加法计算:
from myTest import add
re = add.apply_async((2,17))
我们开始具体介绍,以下均是客户端的执行序列。
0x02 系统启动
我们首先要介绍 在客户端,Celery 系统和 task(实例) 是如何启动的。
2.1 产生Celery
如下代码首先会执行 myTest 这个 Celery。
app = Celery('myTest', broker='redis://localhost:6379')
2.2 task 装饰器
Celery 使用了装饰器来包装待执行任务(因为各种语言的类似概念,在本文中可能会混用装饰器或者注解这两个术语)
@app.task
def add(x,y):
return x+y
task这个装饰器具体执行其实就是返回 _create_task_cls
这个内部函数执行的结果。
这个函数返回一个Proxy,Proxy 在真正执行到的时候,会执行 _task_from_fun
。
_task_from_fun
的作用是:将该task添加到全局变量中,即 当调用 _task_from_fun
时会将该任务添加到app任务列表中,以此达到所有任务共享的目的。这样客户端才能知道这个 task。
def task(self, *args, **opts):
"""Decorator to create a task class out of any callable. """
if USING_EXECV and opts.get('lazy', True):
from . import shared_task
return shared_task(*args, lazy=False, **opts)
def inner_create_task_cls(shared=True, filter=None, lazy=True, **opts):
_filt = filter
def _create_task_cls(fun):
if shared:
def cons(app):
return app._task_from_fun(fun, **opts) # 将该task添加到全局变量中,当调用_task_from_fun时会将该任务添加到app任务列表中,以此达到所有任务共享的目的
cons.__name__ = fun.__name__
connect_on_app_finalize(cons)
if not lazy or self.finalized:
ret = self._task_from_fun(fun, **opts)
else:
# return a proxy object that evaluates on first use
ret = PromiseProxy(self._task_from_fun, (fun,), opts,
__doc__=fun.__doc__)
self._pending.append(ret)
if _filt:
return _filt(ret)
return ret
return _create_task_cls
if len(args) == 1:
if callable(args[0]):
return inner_create_task_cls(**opts)(*args) #执行在这里
return inner_create_task_cls(**opts)
我们具体分析下这个装饰器。
2.2.1 添加任务
在初始化过程中,为每个app添加该任务时,会调用到app._task_from_fun(fun, **options)
。
具体作用是:
- 判断各种参数配置;
- 动态创建task;
- 将任务添加到_tasks任务中;
- 用task的bind方法绑定相关属性到该实例上;
代码如下:
def _task_from_fun(self, fun, name=None, base=None, bind=False, **options):
name = name or self.gen_task_name(fun.__name__, fun.__module__) # 如果传入了名字则使用,否则就使用moudle name的形式
base = base or self.Task # 是否传入Task,否则用类自己的Task类 默认celery.app.task:Task
if name not in self._tasks: # 如果要加入的任务名称不再_tasks中
run = fun if bind else staticmethod(fun) # 是否bind该方法是则直接使用该方法,否则就置为静态方法
task = type(fun.__name__, (base,), dict({
'app': self, # 动态创建Task类实例
'name': name, # Task的name
'run': run, # task的run方法
'_decorated': True, # 是否装饰
'__doc__': fun.__doc__,
'__module__': fun.__module__,
'__header__': staticmethod(head_from_fun(fun, bound=bind)),
'__wrapped__': run}, **options))()
# for some reason __qualname__ cannot be set in type()
# so we have to set it here.
try:
task.__qualname__ = fun.__qualname__
except AttributeError:
pass
self._tasks[task.name] = task # 将任务添加到_tasks任务中
task.bind(self) # connects task to this app # 调用task的bind方法绑定相关属性到该实例上
add_autoretry_behaviour(task, **options)
else:
task = self._tasks[name]
return task
2.2.2 绑定
bind方法的作用是:绑定相关属性到该实例上,因为只知道 task 名字或者代码是不够的,还需要在运行时候拿到 task 的实例。
@classmethod
def bind(cls, app):
was_bound, cls.__bound__ = cls.__bound__, True
cls._app = app # 设置类的_app属性
conf = app.conf # 获取app的配置信息
cls._exec_options = None # clear option cache
if cls.typing is None:
cls.typing = app.strict_typing
for attr_name, config_name in cls.from_config: # 设置类中的默认值
if getattr(cls, attr_name, None) is None: # 如果获取该属性为空
setattr(cls, attr_name, conf[config_name]) # 使用app配置中的默认值
# decorate with annotations from config.
if not was_bound:
cls.annotate()
from celery.utils.threads import LocalStack
cls.request_stack = LocalStack() # 使用线程栈保存数据
# PeriodicTask uses this to add itself to the PeriodicTask schedule.
cls.on_bound(app)
return app
2.3 小结
至此,在客户端(使用者方),Celery 应用已经启动,一个task实例也已经生成,其属性都被绑定在实例上。
0x03 amqp类
在客户端调用 apply_async 的时候,会调用 app.send_task 来具体发送任务,其中用到 amqp,所以我们首先讲讲 amqp 类。
3.1 生成
在 send_task 之中有如下代码,就是:
def send_task(self, ....):
"""Send task by name.
"""
parent = have_parent = None
amqp = self.amqp # 此时生成
此时的 self 是 Celery 应用本身,具体内容我们打印出来看看,从下面我们可以看到 Celery 应用是什么样子。
self = {Celery} <Celery myTest at 0x1eeb5590488>
AsyncResult = {type} <class 'celery.result.AsyncResult'>
Beat = {type} <class 'celery.apps.beat.Beat'>
GroupResult = {type} <class 'celery.result.GroupResult'>
Pickler = {type} <class 'celery.app.utils.AppPickler'>
ResultSet = {type} <class 'celery.result.ResultSet'>
Task = {type} <class 'celery.app.task.Task'>
WorkController = {type} <class 'celery.worker.worker.WorkController'>
Worker = {type} <class 'celery.apps.worker.Worker'>
amqp = {AMQP} <celery.app.amqp.AMQP object at 0x000001EEB5884188>
amqp_cls = {str} 'celery.app.amqp:AMQP'
backend = {DisabledBackend} <celery.backends.base.DisabledBackend object at 0x000001EEB584E248>
clock = {LamportClock} 0
control = {Control} <celery.app.control.Control object at 0x000001EEB57B37C8>
events = {Events} <celery.app.events.Events object at 0x000001EEB56C7188>
loader = {AppLoader} <celery.loaders.app.AppLoader object at 0x000001EEB5705408>
main = {str} 'myTest'
pool = {ConnectionPool} <kombu.connection.ConnectionPool object at 0x000001EEB57A9688>
producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x000001EEB6297508>
registry_cls = {type} <class 'celery.app.registry.TaskRegistry'>
tasks = {TaskRegistry: 10} {'myTest.add': <@task: myTest.add of myTest at 0x1eeb5590488>, 'celery.accumulate': <@task: celery.accumulate of myTest at 0x1eeb5590488>, 'celery.chord_unlock': <@task: celery.chord_unlock of myTest at 0x1eeb5590488>, 'celery.chunks': <@task: celery.chunks of myTest at 0x1eeb5590488>, 'celery.backend_cleanup': <@task: celery.backend_cleanup of myTest at 0x1eeb5590488>, 'celery.group': <@task: celery.group of myTest at 0x1eeb5590488>, 'celery.map': <@task: celery.map of myTest at 0x1eeb5590488>, 'celery.chain': <@task: celery.chain of myTest at 0x1eeb5590488>, 'celery.starmap': <@task: celery.starmap of myTest at 0x1eeb5590488>, 'celery.chord': <@task: celery.chord of myTest at 0x1eeb5590488>}
堆栈为:
amqp, base.py:1205
__get__, objects.py:43
send_task, base.py:705
apply_async, task.py:565
<module>, myclient.py:4
为什么赋值语句就可以生成 amqp?是因为其被 cached_property 修饰。
使用 cached_property 修饰过的函数,就变成是对象的属性,该对象第一次引用该属性时,会调用函数,对象第二次引用该属性时就直接从词典中取了,即 Caches the return value of the get method on first call。
@cached_property
def amqp(self):
"""AMQP related functionality: :class:`~@amqp`."""
return instantiate(self.amqp_cls, app=self)
3.2 定义
AMQP类就是对amqp协议实现的再一次封装,在这里其实就是对 kombu 类的再一次封装。
class AMQP:
"""App AMQP API: app.amqp."""
Connection = Connection
Consumer = Consumer
Producer = Producer
#: compat alias to Connection
BrokerConnection = Connection
queues_cls = Queues
#: Cached and prepared routing table.
_rtable = None
#: Underlying producer pool instance automatically
#: set by the :attr:`producer_pool`.
_producer_pool = None
# Exchange class/function used when defining automatic queues.
# For example, you can use ``autoexchange = lambda n: None`` to use the
# AMQP default exchange: a shortcut to bypass routing
# and instead send directly to the queue named in the routing key.
autoexchange = None
具体内容我们打印出来看看,我们可以看到 amqp 是什么样子。
amqp = {AMQP}
BrokerConnection = {type} <class 'kombu.connection.Connection'>
Connection = {type} <class 'kombu.connection.Connection'>
Consumer = {type} <class 'kombu.messaging.Consumer'>
Producer = {type} <class 'kombu.messaging.Producer'>
app = {Celery} <Celery myTest at 0x252bd2903c8>
argsrepr_maxsize = {int} 1024
autoexchange = {NoneType} None
default_exchange = {Exchange} Exchange celery(direct)
default_queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
kwargsrepr_maxsize = {int} 1024
producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
publisher_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}
queues_cls = {type} <class 'celery.app.amqp.Queues'>
router = {Router} <celery.app.routes.Router object at 0x00000252BDC6B248>
routes = {tuple: 0} ()
task_protocols = {dict: 2} {1: <bound method AMQP.as_task_v1 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>, 2: <bound method AMQP.as_task_v2 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>}
utc = {bool} True
_event_dispatcher = {EventDispatcher} <celery.events.dispatcher.EventDispatcher object at 0x00000252BE750348>
_producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
_rtable = {tuple: 0} ()
具体逻辑如下:
+---------+
| Celery | +----------------------------+
| | | celery.app.amqp.AMQP |
| | | |
| | | |
| | | BrokerConnection +-----> kombu.connection.Connection
| | | |
| amqp+----->+ Connection +-----> kombu.connection.Connection
| | | |
+---------+ | Consumer +-----> kombu.messaging.Consumer
| |
| Producer +-----> kombu.messaging.Producer
| |
| producer_pool +-----> kombu.pools.ProducerPool
| |
| queues +-----> celery.app.amqp.Queues
| |
| router +-----> celery.app.routes.Router
+----------------------------+
0x04 发送Task
我们接着看看客户端如何发送task。
from myTest import add
re = add.apply_async((2,17))
总述下逻辑:
- Producer 初始化过程完成了连接用的内容,比如调用self.connect方法,到预定的Transport类中连接载体,并初始化Chanel,self.chanel = self.connection;
- 调用 Message 封装消息;
- Exchange 将 routing_key 转为 queue;
- 调用 amqp 发送消息;
- Channel 负责最终消息发布;
我们下面详细解读下。
4.1 apply_async in task
这里重要的是两点:
- 如果是 task_always_eager,则产生一个 Kombu . producer;
- 否则,调用 amqp 来发送 task(我们主要看这里);
缩减版代码如下:
def apply_async(self, args=None, kwargs=None, task_id=None, producer=None,
link=None, link_error=None, shadow=None, **options):
"""Apply tasks asynchronously by sending a message.
"""
preopts = self._get_exec_options()
options = dict(preopts, **options) if options else preopts
app = self._get_app()
if app.conf.task_always_eager:
# 获取 producer
with app.producer_or_acquire(producer) as eager_producer:
serializer = options.get('serializer')
body = args, kwargs
content_type, content_encoding, data = serialization.dumps(
body, serializer,
)
args, kwargs = serialization.loads(
data, content_type, content_encoding,
accept=[content_type]
)
with denied_join_result():
return self.apply(args, kwargs, task_id=task_id or uuid(),
link=link, link_error=link_error, **options)
else:
return app.send_task( #调用到这里
self.name, args, kwargs, task_id=task_id, producer=producer,
link=link, link_error=link_error, result_cls=self.AsyncResult,
shadow=shadow, task_type=self,
**options
)
此时如下:
1 apply_async +-------------------+
| |
User +---------------------> | task: myTest.add |
| |
+-------------------+
4.2 send_task
此函数作用是生成任务信息,调用amqp发送任务:
- 获取amqp实例;
- 设置任务id,如果没有传入则生成任务id;
- 生成路由值,如果没有则使用amqp的router;
- 生成route信息;
- 生成任务信息;
- 如果有连接则生成生产者;
- 发送任务消息;
- 生成异步任务实例;
- 返回结果;
具体如下:
def send_task(self, name, ...):
"""Send task by name.
"""
parent = have_parent = None
amqp = self.amqp # 获取amqp实例
task_id = task_id or uuid() # 设置任务id,如果没有传入则生成任务id
producer = producer or publisher # XXX compat # 生成这
router = router or amqp.router # 路由值,如果没有则使用amqp的router
options = router.route(
options, route_name or name, args, kwargs, task_type) # 生成route信息
message = amqp.create_task_message( # 生成任务信息
task_id, name, args, kwargs, countdown, eta, group_id, group_index,
expires, retries, chord,
maybe_list(link), maybe_list(link_error),
reply_to or self.thread_oid, time_limit, soft_time_limit,
self.conf.task_send_sent_event,
root_id, parent_id, shadow, chain,
argsrepr=options.get('argsrepr'),
kwargsrepr=options.get('kwargsrepr'),
)
if connection:
producer = amqp.Producer(connection) # 如果有连接则生成生产者
with self.producer_or_acquire(producer) as P:
with P.connection._reraise_as_library_errors():
self.backend.on_task_call(P, task_id)
amqp.send_task_message(P, name, message, **options) # 发送任务消息
result = (result_cls or self.AsyncResult)(task_id) # 生成异步任务实例
if add_to_parent:
if not have_parent:
parent, have_parent = self.current_worker_task, True
if parent:
parent.add_trail(result)
return result # 返回结果
此时如下:
1 apply_async +-------------------+
| |
User +---------------------> | task: myTest.add |
| |
+--------+----------+
|
|
2 send_task |
|
v
+------+--------+
| Celery myTest |
| |
+------+--------+
|
|
3 send_task_message |
|
v
+-------+---------+
| amqp |
| |
| |
+-----------------+
4.3 生成消息内容
as_task_v2 会具体生成消息内容。大家可以看到如果实现一个消息,需要用到几个大部分:
- headers,包括:task name, task id, expires, 等等;
- 消息类型 和 编码方式:content-encoding,content-type;
- 参数:这些就是 Celery 特有的,用来区分不同队列的,比如:exchange,routing_key 等等;
- body : 就是消息体;
最终具体消息举例如下:
{
"body": "W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d",
"content-encoding": "utf-8",
"content-type": "application/json",
"headers": {
"lang": "py",
"task": "myTest.add",
"id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
"shadow": null,
"eta": null,
"expires": null,
"group": null,
"group_index": null,
"retries": 0,
"timelimit": [null, null],
"root_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
"parent_id": null,
"argsrepr": "(2, 8)",
"kwargsrepr": "{}",
"origin": "gen33652@DESKTOP-0GO3RPO"
},
"properties": {
"correlation_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
"reply_to": "b34fcf3d-da9a-3717-a76f-44b6a6362da1",
"delivery_mode": 2,
"delivery_info": {
"exchange": "",
"routing_key": "celery"
},
"priority": 0,
"body_encoding": "base64",
"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"
}
}
具体代码如下,这里的 sent_event 是后续发送时候需要,并不体现在具体消息内容之中:
def as_task_v2(self, task_id, name, args=None, kwargs=None, ......):
......
return task_message(
headers={
'lang': 'py',
'task': name,
'id': task_id,
'shadow': shadow,
'eta': eta,
'expires': expires,
'group': group_id,
'group_index': group_index,
'retries': retries,
'timelimit': [time_limit, soft_time_limit],
'root_id': root_id,
'parent_id': parent_id,
'argsrepr': argsrepr,
'kwargsrepr': kwargsrepr,
'origin': origin or anon_nodename()
},
properties={
'correlation_id': task_id,
'reply_to': reply_to or '',
},
body=(
args, kwargs, {
'callbacks': callbacks,
'errbacks': errbacks,
'chain': chain,
'chord': chord,
},
),
sent_event={
'uuid': task_id,
'root_id': root_id,
'parent_id': parent_id,
'name': name,
'args': argsrepr,
'kwargs': kwargsrepr,
'retries': retries,
'eta': eta,
'expires': expires,
} if create_sent_event else None,
)
4.4 send_task_message in amqp
amqp.send_task_message(P, name, message, **options) 是用来 amqp 发送任务。
该方法主要是组装待发送任务的参数,如connection,queue,exchange,routing_key等,调用 producer 的 publish 发送任务。
基本套路就是:
- 获得 queue;
- 获得 delivery_mode;
- 获得 exchange;
- 获取重试策略等;
- 调用 producer 来发送消息;
def send_task_message(producer, name, message,
exchange=None, routing_key=None, queue=None,
event_dispatcher=None,
retry=None, retry_policy=None,
serializer=None, delivery_mode=None,
compression=None, declare=None,
headers=None, exchange_type=None, **kwargs):
# 获得 queue, 获得 delivery_mode, 获得 exchange, 获取重试策略等
if before_receivers:
send_before_publish(
sender=name, body=body,
exchange=exchange, routing_key=routing_key,
declare=declare, headers=headers2,
properties=properties, retry_policy=retry_policy,
)
ret = producer.publish(
body,
exchange=exchange,
routing_key=routing_key,
serializer=serializer or default_serializer,
compression=compression or default_compressor,
retry=retry, retry_policy=_rp,
delivery_mode=delivery_mode, declare=declare,
headers=headers2,
**properties
)
if after_receivers:
send_after_publish(sender=name, body=body, headers=headers2,
exchange=exchange, routing_key=routing_key)
.....
if sent_event: # 这里就处理了sent_event
evd = event_dispatcher or default_evd
exname = exchange
if isinstance(exname, Exchange):
exname = exname.name
sent_event.update({
'queue': qname,
'exchange': exname,
'routing_key': routing_key,
})
evd.publish('task-sent', sent_event,
producer, retry=retry, retry_policy=retry_policy)
return ret
return send_task_message
此时堆栈为:
send_task_message, amqp.py:473
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4
此时变量为:
qname = {str} 'celery'
queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
ContentDisallowed = {type} <class 'kombu.exceptions.ContentDisallowed'>
alias = {NoneType} None
attrs = {tuple: 18} (('name', None), ('exchange', None), ('routing_key', None), ('queue_arguments', None), ('binding_arguments', None), ('consumer_arguments', None), ('durable', <class 'bool'>), ('exclusive', <class 'bool'>), ('auto_delete', <class 'bool'>), ('no_ack', None), ('alias', None), ('bindings', <class 'list'>), ('no_declare', <class 'bool'>), ('expires', <class 'float'>), ('message_ttl', <class 'float'>), ('max_length', <class 'int'>), ('max_length_bytes', <class 'int'>), ('max_priority', <class 'int'>))
auto_delete = {bool} False
binding_arguments = {NoneType} None
bindings = {set: 0} set()
can_cache_declaration = {bool} True
channel = {str} 'Traceback (most recent call last):\n File "C:\\Program Files\\JetBrains\\PyCharm Community Edition 2020.2.2\\plugins\\python-ce\\helpers\\pydev\\_pydevd_bundle\\pydevd_resolver.py", line 178, in _getPyDictionary\n attr = getattr(var, n)\n File "C:\\User
consumer_arguments = {NoneType} None
durable = {bool} True
exchange = {Exchange} Exchange celery(direct)
exclusive = {bool} False
expires = {NoneType} None
is_bound = {bool} False
max_length = {NoneType} None
max_length_bytes = {NoneType} None
max_priority = {NoneType} None
message_ttl = {NoneType} None
name = {str} 'celery'
no_ack = {bool} False
no_declare = {NoneType} None
on_declared = {NoneType} None
queue_arguments = {NoneType} None
routing_key = {str} 'celery'
_channel = {NoneType} None
_is_bound = {bool} False
queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}
此时逻辑如下:
1 apply_async +-------------------+
| |
User +---------------------> | task: myTest.add |
| |
+--------+----------+
|
|
2 send_task |
|
v
+------+--------+
| Celery myTest |
| |
+------+--------+
|
|
3 send_task_message |
|
v
+-------+---------+
| amqp |
+-------+---------+
|
|
4 publish |
|
v
+----+------+
| producer |
| |
+-----------+
4.5 publish in producer
在 produer 之中,调用 channel 来发送信息。
def _publish(self, body, priority, content_type, content_encoding,
headers, properties, routing_key, mandatory,
immediate, exchange, declare):
channel = self.channel
message = channel.prepare_message(
body, priority, content_type,
content_encoding, headers, properties,
)
if declare:
maybe_declare = self.maybe_declare
[maybe_declare(entity) for entity in declare]
# handle autogenerated queue names for reply_to
reply_to = properties.get('reply_to')
if isinstance(reply_to, Queue):
properties['reply_to'] = reply_to.name
return channel.basic_publish( # 发送消息
message,
exchange=exchange, routing_key=routing_key,
mandatory=mandatory, immediate=immediate,
)
变量为:
body = {str} '[[2, 8], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
compression = {NoneType} None
content_encoding = {str} 'utf-8'
content_type = {str} 'application/json'
declare = {list: 1} [<unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>]
delivery_mode = {int} 2
exchange = {str} ''
exchange_name = {str} ''
expiration = {NoneType} None
headers = {dict: 15} {'lang': 'py', 'task': 'myTest.add', 'id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'parent_id': None, 'argsrepr': '(2, 8)', 'kwargsrepr': '{}', 'origin': 'gen11468@DESKTOP-0GO3RPO'}
immediate = {bool} False
mandatory = {bool} False
priority = {int} 0
properties = {dict: 3} {'correlation_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'reply_to': '2c938063-64b8-35f5-ac9f-a1c0915b6f71', 'delivery_mode': 2}
retry = {bool} True
retry_policy = {dict: 4} {'max_retries': 3, 'interval_start': 0, 'interval_max': 1, 'interval_step': 0.2}
routing_key = {str} 'celery'
self = {Producer} <Producer: <promise: 0x1eeb62c44c8>>
serializer = {str} 'json'
此时逻辑为:
1 apply_async +-------------------+
| |
User +---------------------> | task: myTest.add |
| |
+--------+----------+
|
2 send_task |
|
v
+------+--------+
| Celery myTest |
| |
+------+--------+
|
3 send_task_message |
|
v
+-------+---------+
| amqp |
+-------+---------+
|
4 publish |
|
v
+----+------+
| producer |
| |
+----+------+
|
|
5 basic_publish |
v
+----+------+
| channel |
| |
+-----------+
至此一个任务就发送出去,等待着消费者消费掉任务。
4.6 redis 内容
发送之后,task 就被存储在redis的队列之中。在redis 的结果是:
127.0.0.1:6379> keys *
1) "_kombu.binding.reply.testMailbox.pidbox"
2) "_kombu.binding.testMailbox.pidbox"
3) "celery"
4) "_kombu.binding.celeryev"
5) "_kombu.binding.celery"
6) "_kombu.binding.reply.celery.pidbox"
127.0.0.1:6379> lrange celery 0 -1
1) "{\"body\": \"W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d\", \"content-encoding\": \"utf-8\", \"content-type\": \"application/json\", \"headers\": {\"lang\": \"py\", \"task\": \"myTest.add\", \"id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"shadow\": null, \"eta\": null, \"expires\": null, \"group\": null, \"group_index\": null, \"retries\": 0, \"timelimit\": [null, null], \"root_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"parent_id\": null, \"argsrepr\": \"(2, 8)\", \"kwargsrepr\": \"{}\", \"origin\": \"gen33652@DESKTOP-0GO3RPO\"}, \"properties\": {\"correlation_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"reply_to\": \"b34fcf3d-da9a-3717-a76f-44b6a6362da1\", \"delivery_mode\": 2, \"delivery_info\": {\"exchange\": \"\", \"routing_key\": \"celery\"}, \"priority\": 0, \"body_encoding\": \"base64\", \"delivery_tag\": \"fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c\"}}"
4.6.1 delivery_tag 作用
可以看到,最终消息中,有一个 delivery_tag 变量,这里要特殊说明下。
可以认为 delivery_tag 是消息在 redis 之中的唯一标示,是 UUID 格式。
具体举例如下:
"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"
。
后续 QoS 就使用 delivery_tag 来做各种处理,比如 ack, snack。
with self.pipe_or_acquire() as pipe:
pipe.zadd(self.unacked_index_key, *zadd_args) \
.hset(self.unacked_key, delivery_tag,
dumps([message._raw, EX, RK])) \
.execute()
super().append(message, delivery_tag)
4.6.2 delivery_tag 何时生成
我们关心的是在发送消息时候,何时生成 delivery_tag。
结果发现是在 Channel 的 _next_delivery_tag 函数中,是在发送消息之前,对消息做了进一步增强。
def _next_delivery_tag(self):
return uuid()
具体堆栈如下:
_next_delivery_tag, base.py:595
_inplace_augment_message, base.py:614
basic_publish, base.py:599
_publish, messaging.py:200
_ensured, connection.py:525
publish, messaging.py:178
send_task_message, amqp.py:532
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4
至此,客户端发送 task 的流程已经结束,有兴趣的可以看看 [源码解析] 并行分布式任务队列 Celery 之 消费动态流程 此文从服务端角度讲解收到 Task 如何消费。