This repository was archived by the owner on Dec 4, 2024. It is now read-only.

Generating server id takes a long time #675

@sharego


In our production environment, after deploying a Marathon application, marathon-lb kept logging:
marathon_lb: server id collision for xxx was already assigned, retrying with xxx

As a result, the HAProxy configuration file was not updated, so none of the new tasks could be reached.

To restore production access, we deleted the tasks of the old version. marathon-lb then generated a new HAProxy configuration file successfully and the error logging stopped.

Unfortunately, we have not been able to reproduce the situation.

Environment details:

  1. mesos version: 1.8.2
  2. marathon version: 1.6.549
  3. marathon-lb version: 1.14.0
  4. haproxy version: 2.0.3
  5. marathon-lb mode: sse
  6. running mode: docker container
  7. before recovery, longest new server name: 84 recursive retries, 5417 bytes
  8. before the "marathon_lb: server id collision" messages appeared, the stdout file already contained repeated "marathon_lb: backend server xxx on yyyy" lines

Our analysis:

For some reason, app backends were duplicated within an app or across apps, while the server id in HAProxy is a global value.
The calculate_server_id method is recursive: it keeps calling itself until it finds a value that has not been assigned yet, and in the meantime the HAProxy configuration file is not updated.
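
The collision mechanism described above can be illustrated with a minimal sketch. This is not marathon-lb's actual code: the function name derive_server_id, the SHA-1 based hashing, the "#attempt" suffix, and the retry limit are all assumptions made for illustration. It shows how a backend listed twice collides on its first attempt when ids are drawn from one shared set, and how a bounded loop could replace open-ended recursion.

```python
import hashlib


def derive_server_id(server_name, taken_ids, max_attempts=100):
    """Hypothetical stand-in for calculate_server_id (not marathon-lb's code).

    Hashes the name into a positive 31-bit integer. On a collision it
    retries with a numbered suffix, up to max_attempts times, instead of
    recursing without a bound. Returns (server_id, attempts_used).
    """
    for attempt in range(max_attempts):
        salted = "{}#{}".format(server_name, attempt).encode("utf-8")
        digest = hashlib.sha1(salted).digest()
        server_id = int.from_bytes(digest[:4], "big") & 0x7FFFFFFF
        if server_id != 0 and server_id not in taken_ids:
            taken_ids.add(server_id)
            return server_id, attempt + 1
    raise RuntimeError("no free server id for {!r}".format(server_name))


taken = set()  # shared by every backend, mirroring the global id space
first_id, first_attempts = derive_server_id("web_10.0.0.1_31000", taken)
# The same backend listed a second time hashes to an id that is already
# taken, so it needs at least one retry.
second_id, second_attempts = derive_server_id("web_10.0.0.1_31000", taken)
```

With a retry limit, a pathological input would raise an error rather than stall configuration generation, though whether that trade-off suits marathon-lb is a judgment for its maintainers.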

Suggested solution:

The backends property of MarathonService is a set, so the MarathonBackend object should define a hash method.
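
A rough sketch of that suggestion follows. The class below is a simplified, hypothetical version of MarathonBackend: the choice of host and port as the identifying fields is an assumption, and the real class may carry more attributes. Note that a Python set only treats two objects as duplicates when they have equal hashes and also compare equal, so the sketch defines `__eq__` alongside `__hash__`.

```python
class MarathonBackend:
    """Simplified, hypothetical backend; the real class may differ."""

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def _key(self):
        # Assumption: host and port together identify a backend.
        return (self.host, self.port)

    def __eq__(self, other):
        if not isinstance(other, MarathonBackend):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


backends = set()
backends.add(MarathonBackend("10.0.0.1", 31000))
backends.add(MarathonBackend("10.0.0.1", 31000))  # duplicate, not added again
backends.add(MarathonBackend("10.0.0.2", 31000))
```

With both methods defined, the set keeps a single entry per host and port pair, so a duplicated backend would not reach server id generation twice.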
