patroni.dcs.etcd module

class patroni.dcs.etcd.AbstractEtcd(config: Dict[str, Any], mpp: AbstractMPP, client_cls: Type[AbstractEtcdClientWithFailover], retry_errors_cls: Type[Exception] | Tuple[Type[Exception], ...])

Bases: AbstractDCS

__init__(config: Dict[str, Any], mpp: AbstractMPP, client_cls: Type[AbstractEtcdClientWithFailover], retry_errors_cls: Type[Exception] | Tuple[Type[Exception], ...]) → None

Prepare DCS paths, MPP object, initial values for state information and processing dependencies.

Variables:

config – dict, reference to the config section of the selected DCS, i.e. zookeeper for zookeeper, etcd for etcd, etc.

_abc_impl = <_abc._abc_data object>
abstract property _client: AbstractEtcdClientWithFailover

Return the correct type of etcd client.

_handle_exception(e: Exception, name: str = '', do_sleep: bool = False, raise_ex: Exception | None = None) → None
_run_and_handle_exceptions(method: Callable[[...], Any], *args: Any, **kwargs: Any) → Any
get_etcd_client(config: Dict[str, Any], client_cls: Type[AbstractEtcdClientWithFailover]) → AbstractEtcdClientWithFailover
handle_etcd_exceptions(func: Callable[[...], Any], *args: Any, **kwargs: Any) → Any
reload_config(config: Config | Dict[str, Any]) → None

Load and set relevant values from configuration.

Sets loop_wait, ttl and retry_timeout properties.

Parameters:

config – Loaded configuration information object or dictionary of key value pairs.

retry(method: Callable[[...], Any], *args: Any, **kwargs: Any) → Any
set_retry_timeout(retry_timeout: int) → None

Set the new value for retry_timeout.

set_socket_options(sock: socket, socket_options: Collection[Tuple[int, int, int]] | None) → None
set_ttl(ttl: int) → bool | None

Set the new ttl value for DCS keys.

property ttl: int

Get current ttl value.

class patroni.dcs.etcd.AbstractEtcdClientWithFailover(config: Dict[str, Any], dns_resolver: DnsCachingResolver, cache_ttl: int = 300)

Bases: ABC, Client

ERROR_CLS: Type[Exception]
__init__(config: Dict[str, Any], dns_resolver: DnsCachingResolver, cache_ttl: int = 300) → None
_abc_impl = <_abc._abc_data object>
_calculate_timeouts(etcd_nodes: int, timeout: float | None = None) → Tuple[int, float, int]

Calculate the request timeout and the number of retries per single etcd node. If the timeout per node is too small (less than one second), the number of nodes to try is reduced. For a cluster with only one node, up to 2 retries are attempted. For a cluster with 2 nodes, 1 retry is attempted for every node. For clusters with 3 or more nodes there are no retries, because switching to a different node is preferable.
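The rules above can be read as a small standalone function. The following is only a sketch of one possible interpretation, not Patroni's actual code: the name calculate_timeouts, the MIN_TIMEOUT constant and the order of the returned tuple (nodes to try, timeout per attempt, retries per node) are assumptions made for illustration.

```python
from typing import Tuple

# Assumed lower bound (in seconds) for a single request attempt.
MIN_TIMEOUT = 1.0


def calculate_timeouts(etcd_nodes: int, timeout: float) -> Tuple[int, float, int]:
    """Return (nodes to try, timeout per attempt, retries per node)."""
    # 1 node -> up to 3 attempts, 2 nodes -> up to 2 attempts, 3+ nodes -> 1 attempt.
    max_attempts = 4 - min(etcd_nodes, 3)
    attempts = 1
    per_node_timeout = timeout
    while etcd_nodes > 0:
        per_node_timeout = timeout / etcd_nodes
        if per_node_timeout >= MIN_TIMEOUT:
            # Small clusters: split the per-node budget into several attempts,
            # as long as every attempt still gets at least MIN_TIMEOUT.
            while attempts < max_attempts and per_node_timeout / (attempts + 1) >= MIN_TIMEOUT:
                attempts += 1
            per_node_timeout /= attempts
            break
        # The per-node timeout is too small: try fewer nodes, without retries.
        etcd_nodes -= 1
        max_attempts = 1
    return etcd_nodes, per_node_timeout, attempts - 1


if __name__ == '__main__':
    for nodes in (1, 2, 3, 5):
        print(nodes, calculate_timeouts(nodes, 6.0))
```

With a 6-second budget, a single node gets three attempts of 2 seconds each, while a three-node cluster gets one 2-second attempt per node.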

_do_http_request(retry: Retry | None, machines_cache: List[str], request_executor: Callable[[...], HTTPResponse], method: str, path: str, fields: Dict[str, Any] | None = None, **kwargs: Any) → HTTPResponse
_get_headers() → Dict[str, str]
_get_machines_cache_from_config() → List[str]
_get_machines_cache_from_dns(host: str, port: int) → List[str]

One host might resolve to multiple IP addresses. This method builds a list out of them.
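As an illustration of turning several resolved addresses into a list, here is a minimal sketch. The helper name urls_from_addrinfo and the URL layout are assumptions, not part of the Patroni API. The input has the shape returned by socket.getaddrinfo().

```python
import socket
from typing import Any, List, Sequence, Tuple

AddrInfo = Tuple[Any, Any, int, str, Tuple[Any, ...]]


def urls_from_addrinfo(protocol: str, port: int, addrinfo: Sequence[AddrInfo]) -> List[str]:
    """Build one URL per distinct resolved address, preserving order."""
    urls: List[str] = []
    for family, _socktype, _proto, _canonname, sockaddr in addrinfo:
        ip = sockaddr[0]
        # IPv6 literals must be wrapped in brackets inside a URL.
        host = '[{0}]'.format(ip) if family == socket.AF_INET6 else ip
        url = '{0}://{1}:{2}'.format(protocol, host, port)
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == '__main__':
    records = [
        (socket.AF_INET, socket.SOCK_STREAM, 6, '', ('10.0.0.1', 2379)),
        (socket.AF_INET6, socket.SOCK_STREAM, 6, '', ('fe80::1', 2379, 0, 0)),
        (socket.AF_INET, socket.SOCK_STREAM, 6, '', ('10.0.0.1', 2379)),
    ]
    print(urls_from_addrinfo('http', 2379, records))
```

For a live lookup, the records could come from socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM, socket.IPPROTO_TCP).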

_get_machines_cache_from_srv(srv: str, srv_suffix: str | None = None) → List[str]

Fetch the list of etcd cluster members by resolving the _etcd-server._tcp. SRV record. This record should contain a list of hosts and peer ports that can be used to run a ‘GET http://{host}:{port}/members’ request (peer protocol).

_get_machines_list(machines_cache: List[str]) → List[str]

Gets list of members from Etcd cluster using API

Parameters:

machines_cache – initial list of Etcd members

Returns:

list of clientURLs retrieved from Etcd cluster

Raises:

EtcdConnectionFailed – if failed

abstract _get_members(base_uri: str, **kwargs: Any) → List[str]

Returns:

list of clientURLs

_load_machines_cache() → bool

This method should fill up _machines_cache from scratch. It can happen only in two cases:

  1. During class initialization.

  2. When all etcd members have failed.

_prepare_common_parameters(etcd_nodes: int, timeout: float | None = None) → Dict[str, Any]
abstract _prepare_get_members(etcd_nodes: int) → Dict[str, Any]

Returns:

request parameters

abstract _prepare_request(kwargs: Dict[str, Any], params: Dict[str, Any] | None = None, method: str | None = None) → Callable[[...], HTTPResponse]

Returns:

request_executor

_refresh_machines_cache(machines_cache: List[str] | None = None) → bool

Get the etcd cluster topology using the Etcd API and store it in self._machines_cache.

Parameters:

machines_cache – the list of nodes to query with the API request, in addition to the values stored in self._machines_cache

Returns:

True if self._machines_cache was updated with new values

Raises:

EtcdException – if failed to get topology and machines_cache was specified.

The self._machines_cache will not be updated if nodes from the list are not accessible or if they are not returning correct results.
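The contract described above (update the cache only on a usable answer, raise only when an explicit node list was given) can be sketched as follows. This is an illustrative model under stated assumptions, not the Patroni implementation: MachinesCache, TopologyError and the fetch_members callable are hypothetical names, and the order in which nodes are tried is a guess.

```python
from typing import Callable, List, Optional


class TopologyError(Exception):
    """Hypothetical stand-in for the EtcdException named above."""


class MachinesCache:
    def __init__(self, fetch_members: Callable[[str], List[str]], initial: List[str]) -> None:
        # fetch_members is a hypothetical callable that asks one node for the member list.
        self._fetch_members = fetch_members
        self.machines_cache = list(initial)

    def refresh(self, machines_cache: Optional[List[str]] = None) -> bool:
        # Try the cached nodes first, then any extra nodes supplied by the caller.
        candidates = list(self.machines_cache)
        for node in machines_cache or []:
            if node not in candidates:
                candidates.append(node)

        for node in candidates:
            try:
                members = self._fetch_members(node)
            except Exception:
                continue  # node is not accessible, try the next one
            if not members:
                continue  # node did not return a usable result
            updated = set(members) != set(self.machines_cache)
            if updated:
                self.machines_cache = list(dict.fromkeys(members))
            return updated

        # Nothing answered: the cache is left untouched.
        if machines_cache is not None:
            raise TopologyError('failed to get the cluster topology')
        return False
```

A caller that passes no explicit list gets False when every node is unreachable, and keeps working with the old cache.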

static _update_dns_cache(func: Callable[[str, int], None], machines: List[str]) → None
api_execute(path: str, method: str, params: Dict[str, Any] | None = None, timeout: float | None = None) → Any

Executes the query.

static get_srv_record(host: str) → List[Tuple[str, int]]
property machines: List[str]

The original machines method (property) of the etcd.Client class raises an exception when it fails to get the list of etcd cluster members. This method is called only when a request failed on one of the etcd members during an api_execute call. For us it is more important to execute the original request than to get the new topology of the etcd cluster, so we catch this exception and return an empty list of machines. Later, during the next api_execute call, we forcefully update machines_cache.

Also this method implements the same timeout-retry logic as api_execute, because the original method was retrying 2 times with the read_timeout on each node.

In a later refactoring, the whole logic was moved to the _get_machines_list() method.

property machines_cache: List[str]
reload_config(config: Dict[str, Any]) → None
set_base_uri(value: str) → None
set_machines_cache_ttl(cache_ttl: int) → None
set_read_timeout(timeout: float) → None

class patroni.dcs.etcd.DnsCachingResolver(cache_time: float = 600.0, cache_fail_time: float = 30.0)

Bases: Thread

__init__(cache_time: float = 600.0, cache_fail_time: float = 30.0) → None

This constructor should always be called with keyword arguments. Arguments are:

group should be None; reserved for future extension when a ThreadGroup class is implemented.

target is the callable object to be invoked by the run() method. Defaults to None, meaning nothing is called.

name is the thread name. By default, a unique name is constructed of the form “Thread-N” where N is a small decimal number.

args is a list or tuple of arguments for the target invocation. Defaults to ().

kwargs is a dictionary of keyword arguments for the target invocation. Defaults to {}.

If a subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.

static _do_resolve(host: str, port: int) → List[Tuple[AddressFamily, SocketKind, int, str, Tuple[str, int] | Tuple[str, int, int, int]]]
remove(host: str, port: int) → None
resolve(host: str, port: int) → List[Tuple[AddressFamily, SocketKind, int, str, Tuple[str, int] | Tuple[str, int, int, int]]]
resolve_async(host: str, port: int, attempt: int = 0) → None
run() → None

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

class patroni.dcs.etcd.Etcd(config: Dict[str, Any], mpp: AbstractMPP)

Bases: AbstractEtcd

__init__(config: Dict[str, Any], mpp: AbstractMPP) → None

Prepare DCS paths, MPP object, initial values for state information and processing dependencies.

Variables:

config – dict, reference to the config section of the selected DCS, i.e. zookeeper for zookeeper, etcd for etcd, etc.

_abc_impl = <_abc._abc_data object>
property _client: EtcdClient

Return the correct type of etcd client.

_cluster_from_nodes(etcd_index: int, nodes: Dict[str, EtcdResult]) → Cluster
_delete_leader(*args: Any, **kwargs: Any) → Any

Remove leader key from DCS.

This method should remove leader key if current instance is the leader.

Parameters:

leader – Leader object with information about the leader.

Returns:

True if successfully committed to DCS.

_do_attempt_to_acquire_leader() → bool
_do_update_leader() → bool
_load_cluster(path: str, loader: Callable[[str], Cluster | Dict[int, Cluster]]) → Cluster | Dict[int, Cluster]

Main abstract method that implements the loading of Cluster instance.

Note

Internally this method should call the loader method, which builds the Cluster object that represents the current state and topology of the cluster in DCS. This method is supposed to be called only by the get_cluster() method.

Parameters:
  • path – the path in DCS where to load Cluster(s) from.

  • loader – one of _postgresql_cluster_loader() or _mpp_cluster_loader().

Raises:

DCSError – in case of communication problems with DCS. If the current node was running as a primary and the exception is raised, the instance will be demoted.

_mpp_cluster_loader(path: str) → Dict[int, Cluster]

Load and build all PostgreSQL clusters from a single MPP cluster.

Parameters:

path – the path in DCS where to load Cluster(s) from.

Returns:

all MPP groups as dict, with group IDs as keys and Cluster objects as values.

_postgresql_cluster_loader(path: str) → Cluster

Load and build the Cluster object from DCS, which represents a single PostgreSQL cluster.

Parameters:

path – the path in DCS where to load Cluster from.

Returns:

Cluster instance.

_update_leader(**kwargs: Any)

Update leader key (or session) ttl.

Note

The leader key must be updated with a CAS (Compare And Swap) operation; for etcd, for example, the prevValue parameter must be used.

If update fails due to DCS not being accessible or because it is not able to process requests (hopefully temporary), the DCSError exception should be raised.

Parameters:

leader – a reference to a current leader object.

Returns:

True if leader key (or session) has been updated successfully.
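To make the compare-and-swap requirement concrete, here is a small in-memory sketch. CasStore and update_leader are hypothetical names used only for illustration and are not an etcd client. In a real DCS the comparison and the write happen atomically on the server side, which the lock merely imitates.

```python
import threading
import time
from typing import Dict, Optional, Tuple


class CasStore:
    """Hypothetical in-memory key-value store with TTLs; not an etcd client."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}
        self._lock = threading.Lock()

    def _get_unlocked(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None or item[1] <= time.monotonic():
            return None  # missing or expired
        return item[0]

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._get_unlocked(key)

    def put(self, key: str, value: str, ttl: float) -> None:
        with self._lock:
            self._data[key] = (value, time.monotonic() + ttl)

    def compare_and_swap(self, key: str, prev_value: str, value: str, ttl: float) -> bool:
        # Write only if the stored value still equals prev_value.
        with self._lock:
            if self._get_unlocked(key) != prev_value:
                return False
            self._data[key] = (value, time.monotonic() + ttl)
            return True


def update_leader(store: CasStore, name: str, ttl: float) -> bool:
    # Refresh the TTL only if this node still owns the leader key.
    return store.compare_and_swap('/leader', name, name, ttl)


if __name__ == '__main__':
    store = CasStore()
    store.put('/leader', 'node1', ttl=30)
    print(update_leader(store, 'node1', ttl=30))
    print(update_leader(store, 'node2', ttl=30))
```

Because the swap is conditional on the previous value, a node that has lost the leader key cannot extend its TTL by accident.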

_write_failsafe(*args: Any, **kwargs: Any) → Any

Write current cluster topology to DCS that will be used by failsafe mechanism (if enabled).

Parameters:

value – failsafe topology serialized in JSON format.

Returns:

True if successfully committed to DCS.

_write_leader_optime(*args: Any, **kwargs: Any) → Any

Write current WAL LSN into /optime/leader key in DCS.

Parameters:

last_lsn – absolute WAL LSN in bytes.

Returns:

True if successfully committed to DCS.

_write_status(*args: Any, **kwargs: Any) → Any

Write current WAL LSN and confirmed_flush_lsn of permanent slots into the /status key in DCS.

Parameters:

value – status serialized in JSON format.

Returns:

True if successfully committed to DCS.

attempt_to_acquire_leader(**kwargs: Any)

Attempt to acquire leader lock.

Note

This method should create /leader key with the value _name.

The key must be created atomically. In case the key already exists it should not be overwritten and False must be returned.

If key creation fails due to DCS not being accessible or because it is not able to process requests (hopefully temporary), the DCSError exception should be raised.

Returns:

True if key has been created successfully.
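The atomic "create only if absent" behaviour described in the note can be modelled with a small in-memory sketch. CreateOnlyStore and attempt_to_acquire_leader below are hypothetical illustrations, not the Patroni or etcd API, and the TTL handling of a real leader key is left out for brevity.

```python
import threading
from typing import Dict, List


class CreateOnlyStore:
    """Hypothetical in-memory store; the lock imitates the atomicity a DCS provides."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()

    def create(self, key: str, value: str) -> bool:
        # Check and write under one lock, so only one creator can win.
        with self._lock:
            if key in self._data:
                return False  # key already exists: do not overwrite
            self._data[key] = value
            return True

    def get(self, key: str) -> str:
        with self._lock:
            return self._data[key]


def attempt_to_acquire_leader(store: CreateOnlyStore, name: str) -> bool:
    return store.create('/leader', name)


def race(store: CreateOnlyStore, names: List[str]) -> List[str]:
    """Let several nodes race for the leader key; return the names that won."""
    winners: List[str] = []

    def worker(name: str) -> None:
        if attempt_to_acquire_leader(store, name):
            winners.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


if __name__ == '__main__':
    store = CreateOnlyStore()
    print(race(store, ['node{0}'.format(i) for i in range(8)]))
```

However the threads interleave, exactly one node ends up in the returned list, and the stored value is that node's name.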

cancel_initialization(*args: Any, **kwargs: Any) → Any

Removes the initialize key for a cluster.

Returns:

True if successfully committed to DCS.

delete_cluster(*args: Any, **kwargs: Any) → Any

Delete cluster from DCS.

Returns:

True if successfully committed to DCS.

delete_sync_state(*args: Any, **kwargs: Any) → Any

Delete the synchronous state from DCS.

Parameters:

version – for conditional deletion of the key/object.

Returns:

True if delete successful.

initialize(*args: Any, **kwargs: Any) → Any

Race for cluster initialization.

This method should atomically create initialize key and return True, otherwise it should return False.

Parameters:
  • create_new – False if the key should already exist (in the case we are setting the system_id).

  • sysid – PostgreSQL cluster system identifier, if specified, is written to the key.

Returns:

True if key has been created successfully.

static member(node: EtcdResult) → Member
set_config_value(*args: Any, **kwargs: Any) → Any

Create or update /config key in DCS.

Parameters:
  • value – new value to set in the config key.

  • version – for conditional update of the key/object.

Returns:

True if successfully committed to DCS.

set_failover_value(*args: Any, **kwargs: Any) → Any

Create or update /failover key.

Parameters:
  • value – value to set.

  • version – for conditional update of the key/object.

Returns:

True if successfully committed to DCS.

set_history_value(*args: Any, **kwargs: Any) → Any

Set value for history in DCS.

Parameters:

value – new value of history key/object.

Returns:

True if successfully committed to DCS.

set_sync_state_value(*args: Any, **kwargs: Any) → Any

Set synchronous state in DCS.

Parameters:
  • value – the new value of /sync key.

  • version – for conditional update of the key/object.

Returns:

version of the new object or False in case of error.

set_ttl(ttl: int) → bool | None

Set the new ttl value for DCS keys.

take_leader(*args: Any, **kwargs: Any) → Any

Establish a new leader in DCS.

Note

This method should create leader key with value of _name and ttl of ttl.

Since it could be called only on initial cluster bootstrap it could create this key regardless, overwriting the key if necessary.

Returns:

True if successfully committed to DCS.

touch_member(*args: Any, **kwargs: Any) → Any

Update member key in DCS.

Note

This method should create or update the key named /members/ + _name with the value of data in a given DCS.

Parameters:

data – information about an instance (including connection strings).

Returns:

True if successfully committed to DCS.

watch(leader_version: int | None, timeout: float) → bool

Sleep if the current node is a leader, otherwise, watch for changes of leader key with a given timeout.

Parameters:
  • leader_version – version of a leader key.

  • timeout – timeout in seconds.

Returns:

True if the next run of the HA cycle should be rescheduled.

class patroni.dcs.etcd.EtcdClient(config: Dict[str, Any], dns_resolver: DnsCachingResolver, cache_ttl: int = 300)

Bases: AbstractEtcdClientWithFailover

ERROR_CLS

alias of EtcdError

__init__(config: Dict[str, Any], dns_resolver: DnsCachingResolver, cache_ttl: int = 300) → None
_abc_impl = <_abc._abc_data object>
_get_members(base_uri: str, **kwargs: Any) → List[str]

Returns:

list of clientURLs

_prepare_get_members(etcd_nodes: int) → Dict[str, Any]

Returns:

request parameters

_prepare_request(kwargs: Dict[str, Any], params: Dict[str, Any] | None = None, method: str | None = None) → Callable[[...], HTTPResponse]

Returns:

request_executor

exception patroni.dcs.etcd.EtcdError(value: Any)

Bases: DCSError

exception patroni.dcs.etcd.EtcdRaftInternal(message=None, payload=None)

Bases: EtcdException

Raft Internal Error

patroni.dcs.etcd.catch_etcd_errors(func: Callable[[...], Any]) → Any