ExCore is a Configuration/Registry System designed for deep learning, along with some utilities.
✨ ExCore supports auto-completion, type-hinting, docstring and code navigation for config files
ExCore is still in an early development stage.
The config system is the core of a deep learning project; it enables us to manage and adjust hyperparameters and experiments. There have been several attempts at better config systems because the whole community has been suffering from plain-text config files for a long while.
The config system in ExCore is specifically designed for deep learning training procedures (this generally covers similar procedures as well, e.g. testing and evaluating). The core premise is to categorize the objects to be created in the config into three classes: Primary, Intermediate, and Isolated objects.
- Primary objects are those directly used in training, e.g. models and optimizers. ExCore will instantiate and return them.
- Intermediate objects are those indirectly used in training, e.g. the backbone of a model, or model parameters that will be passed to an optimizer. ExCore will instantiate them and pass them to the target Primary objects as arguments according to some rules.
- Isolated objects are Python built-in objects that are parsed when the toml file is loaded, e.g. int, string, list and dict.
ExCore extends the syntax of toml files, introducing some special prefix characters -- !, @, $ and & -- to simplify config definitions.
So we introduce some terminology in ExCore, illustrated by the snippet that follows:
- PrimaryField: e.g. Model, TrainData, TestData and so on.
- RegistryName: the name of a Registry, e.g. Models, Datasets, Losses and so on. It can be the same as a PrimaryField.
- ModuleName: all registered items (classes, functions, modules) are called Modules, so their registered names are ModuleNames.
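In the illustrative snippet below (all names are hypothetical), Model is a PrimaryField, the Registry holding FCN might have the RegistryName Models, and FCN and ResNet are ModuleNames; it also shows all three object classes:

size = 224            # Isolated object: a plain built-in value

[Model.FCN]           # Primary object: instantiated and returned by ExCore
!backbone = "ResNet"  # `!` marks an Intermediate module

[ResNet]              # Intermediate object: instantiated and passed to FCN
layers = 50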
The config system has the following features.
Get rid of `type`
Model:
  type: ResNet # <----- ugly type
  layers: 50
  num_classes: 1

In order to get rid of type, ExCore regards all registered names as reserved words. A Primary module is defined as [PrimaryField.ModuleName], where PrimaryField is one of the pre-defined fields, e.g. Model, Optimizer, and ModuleName is a registered name.
[Model.FCN]
layers = 50
num_classes = 1

Eliminate module nesting
TrainData:
  type: Cityscapes
  dataset_root: data/cityscapes
  transforms:
    - type: ResizeStepScale
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: Normalize
  mode: train
ExCore uses special prefix characters to specify that certain arguments are modules as well. More prefixes will be introduced later.
[TrainData.Cityscapes]
dataset_root = "data/cityscapes"
mode = 'train'
# use `!` to mark this argument as a module. Strictly speaking, the key should be quoted ("!transforms"), but the bare form also works
!transforms = ["ResizeStepScale", "RandomPaddingCrop", "Normalize"]
# `PrimaryField` can be omitted in definition of `Intermediate` module
[ResizeStepScale]
min_scale_factor = 0.5
max_scale_factor = 2.0
scale_step_size = 0.25
# or explicitly specify `PrimaryField`
[Transforms.RandomPaddingCrop]
crop_size = [1024, 512]
# the table can even be omitted when there are no arguments
# [Normalize]
✨ Auto-completion, type-hinting, docstrings and code navigation for config files
The old-style design of plain-text configs has been criticized for being difficult to write (no auto-completion) and for not allowing navigation to the corresponding class. However, the Language Server Protocol can be leveraged to support various code-editing features, such as auto-completion, type-hinting, and code navigation. By utilizing LSP and JSON Schema, ExCore is able to provide auto-completion, some weak type-hinting (if the code is well annotated, e.g. with standard Python type hints, this gets stronger) and docstrings of the corresponding classes.
ExCore dumps the mappings from class names to their file locations to support code navigation. Currently only Neovim is supported; see excore.nvim.
Config inheritance
Use `__base__` to inherit from other toml files. Only dicts are updated locally; values of other types are overwritten directly.

__base__ = ["xxx.toml", "xxxx.toml"]
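For example (an illustrative sketch), given a base file base.toml:

[Model.FCN]
layers = 50
num_classes = 19

a child config can inherit it and update the table locally:

__base__ = ["base.toml"]

[Model.FCN]
num_classes = 21  # merged into the inherited dict; layers = 50 is kept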
`@` Reused module

ExCore uses @ to mark a reused module, which is shared between different modules.
# FCN and SegNet will use the same ResNet object
[Model.FCN]
@backbone = "ResNet"
[Model.SegNet]
@backbone = "ResNet"
[ResNet]
layers = 50
in_channel = 3

is equivalent to:
resnet = ResNet(layers=50, in_channel=3)
FCN(backbone=resnet)
SegNet(backbone=resnet)
# if `!` were used instead, it would be equivalent to
FCN(backbone=ResNet(layers=50, in_channel=3))
SegNet(backbone=ResNet(layers=50, in_channel=3))

`$` Refer to classes and cross-file modules
ExCore uses $ to represent the class itself, which will not be instantiated.
[Model.ResNet]
$block = "BasicBlock"
layers = 50
in_channel = 3

is equivalent to:
from xxx import ResNet, BasicBlock
ResNet(block=BasicBlock, layers=50, in_channel=3)

In order to refer to a module across files, $ can be used before a PrimaryField. For example:
File A:
[Block.BasicBlock]

File B:
[Block.BottleneckBlock]

File C:
[Model.ResNet]
!block="$Block"So we can combine file A and C or file B and C with a toml file
__base__ = ["A.toml", "C.toml"]
# or
__base__ = ["B.toml", "C.toml"]`&`Variable reference
ExCore uses & to refer to a variable defined at the top level of the config.
size = 224
[TrainData.ImageNet]
&train_size = "size"
!transforms = ['RandomResize', 'Pad']
data_path = 'xxx'
[Transform.Pad]
&pad_size = "size"
[TestData.ImageNet]
!transforms = ['Normalize']
&test_size = "size"
data_path = 'xxx'

& can be used in parameters as well. It is typically used together with argument hooks; refer to finegrained_config.
✨ Using python modules in config files
The Registry in ExCore is able to register a module:
from excore import Registry
import torch
MODULE = Registry("module")
MODULE.register_module(torch)

Then you can use torch in the config file:
[Model.ResNet]
$activation = "torch.nn.ReLU"
# or
$activation = "torch.nn.ReLU()"
# or, note: implemented via eval
$activation = "torch.nn.ReLU(inplace)"import torch
from xxx import ResNet
ResNet(torch.nn.ReLU)
# or
ResNet(torch.nn.ReLU())
# or
ResNet(torch.nn.ReLU(inplace=True))

✨ Argument-level hook
ExCore provides a simple way to invoke argument-level hooks that need no extra arguments.
[Optimizer.AdamW]
@params = "$Model.parameters()"
weight_decay = 0.01

If you want to call a class method or static method:
[Model.XXX]
$backbone = "A.from_pretained()"Attributes can also be used.
[Model.XXX]
!channel = "$Block.out_channel"It also can be chained invoke.
[Model.XXX]
!channel = "$Block.last_conv.out_channels"This way requests you to define such methods or attributes in target class and can not pass arguments. So ExCore provides ConfigArgumentHook.
class ConfigArgumentHook(node, enabled)

You need to implement your own class inheriting from ConfigArgumentHook. For example:
from excore.engine.hook import ConfigArgumentHook

from . import HOOKS

@HOOKS.register()
class BnWeightDecayHook(ConfigArgumentHook):
    def __init__(self, node, enabled: bool, bn_weight_decay: bool, weight_decay: float):
        super().__init__(node, enabled)
        self.bn_weight_decay = bn_weight_decay
        self.weight_decay = weight_decay

    def hook(self):
        model = self.node()  # instantiate the wrapped model node
        if self.bn_weight_decay:
            # apply weight decay to every parameter, including BN
            optim_params = model.parameters()
        else:
            # exclude BN parameters from weight decay
            p_bn = [p for n, p in model.named_parameters() if "bn" in n]
            p_non_bn = [p for n, p in model.named_parameters() if "bn" not in n]
            optim_params = [
                {"params": p_bn, "weight_decay": 0},
                {"params": p_non_bn, "weight_decay": self.weight_decay},
            ]
        return optim_params

[Optimizer.SGD]
@params = "$Model@BnWeightDecayHook"
lr = 0.05
momentum = 0.9
weight_decay = 0.0001
[ConfigHook.BnWeightDecayHook]
weight_decay = 0.0001
bn_weight_decay = false
enabled = true

Use @ to call user-defined hooks.
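Conceptually, the config above behaves roughly like the following Python (a sketch; model_node stands for the lazy Model node, and the exact call sequence is handled inside ExCore):

# wrap the model node with the user-defined hook, then build the optimizer
hook = BnWeightDecayHook(model_node, enabled=True, bn_weight_decay=False, weight_decay=0.0001)
optimizer = SGD(params=hook(), lr=0.05, momentum=0.9, weight_decay=0.0001)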
Instance-level hook
If the logic of module building is too complicated, instance-level hooks may be helpful.
TODO
✨ Lazy Config with simple APIs
The core concept of LazyConfig is laziness, i.e. a state of delay. Before instantiation, all parameters are stored in a special dict which additionally records what the target class is. So it is easy to alter any parameter of a module, and to control which modules should be instantiated and which should not. LazyConfig is also used to address the defects of plain-text configs through the Python LSP, which provides code navigation, auto-completion and more.
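Conceptually (a simplified sketch, not ExCore's actual node classes), a lazy node just records the target class and its arguments, and instantiates only when called:

class LazyNode:
    def __init__(self, cls, **params):
        self.cls = cls        # which class to build
        self.params = params  # stored, not yet evaluated

    def __call__(self):
        return self.cls(**self.params)  # instantiate only now

node = LazyNode(dict, a=1)
node.params["a"] = 2  # alter parameters freely before instantiation
obj = node()          # {'a': 2}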
ExCore implements several nodes -- ModuleNode, InternNode, ReusedNode, ClassNode, ConfigHookNode, GetAttr and VariableReference -- and a LazyConfig to manage all nodes.
ExCore provides only two simple APIs to build modules -- load and build_all.
Typically, we follow this procedure:
from excore import config
lazy_cfg = config.load('xxx.toml')
module_dict, run_info = config.build_all(lazy_cfg)

The results of build_all are, respectively, the Primary modules and the Isolated objects.
If you only want to build a certain module:
from excore import config
lazy_cfg = config.load('xxx.toml')
model = lazy_cfg.Model() # Model is one of `PrimaryField`
# or
model = lazy_cfg['Model']()

If you want to follow some other logic to build modules, you can still use LazyConfig to adjust the arguments of nodes and more:
from excore import config
lazy_cfg = config.load('xxx.toml')
lazy_cfg.Model << dict(pre_trained='./')
# or
lazy_cfg.Model.add(pre_trained='./')
module_dict, run_info = config.build_all(lazy_cfg)

✨ Module validation and lazy assignment
Parameters of modules are validated before their initialization and calls, which saves time that would otherwise be wasted on long, serial initializations before an error surfaces.
If any parameter is missing, you can assign it manually to avoid crashing. The value will be parsed into a str, int, list, tuple, or dict.
Use the environment variables EXCORE_VALIDATE and EXCORE_MANUAL_SET to control whether to validate and whether to allow manual assignment.
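For example (a sketch; the exact accepted values may differ between versions), the variables can be set before the config is parsed:

import os

# assumption: truthy values enable the behavior, read at config-parsing time
os.environ["EXCORE_VALIDATE"] = "1"    # validate module parameters
os.environ["EXCORE_MANUAL_SET"] = "1"  # allow manual assignment of missing ones

from excore import config
lazy_cfg = config.load("xxx.toml")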
Config print
from excore import config
cfg = config.load('xx.toml')
print(cfg)

Result:
╒══════════════════════════╤══════════════════════════════════════════════════════════════════════╕
│ size │ 1024 │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ TrainData.CityScapes │ ╒═════════════╤════════════════════════════════════════════════════╕ │
│ │ │ &train_size │ size │ │
│ │ ├─────────────┼────────────────────────────────────────────────────┤ │
│ │ │ !transforms │ ['RandomResize', 'RandomFlip', 'Normalize', 'Pad'] │ │
│ │ ├─────────────┼────────────────────────────────────────────────────┤ │
│ │ │ data_path │ xxx │ │
│ │ ╘═════════════╧════════════════════════════════════════════════════╛ │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ Transform.RandomFlip │ ╒══════╤═════╕ │
│ │ │ prob │ 0.5 │ │
│ │ ├──────┼─────┤ │
│ │ │ axis │ 0 │ │
│ │ ╘══════╧═════╛ │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ Transform.Pad │ ╒═══════════╤══════╕ │
│ │ │ &pad_size │ size │ │
│ │ ╘═══════════╧══════╛ │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ Normalize.std │ [0.5, 0.5, 0.5] │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ Normalize.mean │ [0.5, 0.5, 0.5] │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ TestData.CityScapes │ ╒═════════════╤═══════════════╕ │
│ │ │ !transforms │ ['Normalize'] │ │
│ │ ├─────────────┼───────────────┤ │
│ │ │ &test_size │ size │ │
│ │ ├─────────────┼───────────────┤ │
│ │ │ data_path │ xxx │ │
│ │ ╘═════════════╧═══════════════╛ │
├──────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│ Model.FCN │ ╒═══════════╤════════════╕ │
│ │ │ @backbone │ ResNet │ │
│ │ ├───────────┼────────────┤ │
│ │ │ @head │ SimpleHead │ │
│ │ ╘═══════════╧════════════╛ │
...
✨ LazyRegistry
To reduce unnecessary imports, `ExCore` provides `LazyRegistry`, which stores the mappings from class/function names to their `qualname`s and dumps the mappings to a local file. When the config is parsed, only the necessary modules are imported.
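Conceptually (a simplified sketch of the mechanism, not ExCore's internals), resolving a registered name through a stored qualname defers the import until the config actually needs the class:

import importlib

# hypothetical dumped mapping: registered name -> qualname
MAPPING = {"ResNet": "torchvision.models.resnet.ResNet"}

def resolve(name: str):
    module_path, attr = MAPPING[name].rsplit(".", 1)
    module = importlib.import_module(module_path)  # imported only now
    return getattr(module, attr)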
Extra information

from excore import Registry
Models = Registry("Model", extra_field="is_backbone")
@Models.register(is_backbone=True)
class ResNet:
pass

Module classification and fuzzy search
from excore import Registry
Models = Registry("Model", extra_field="is_backbone")
@Models.register(is_backbone=True)
class ResNet:
pass
@Models.register(is_backbone=True)
class ResNet50:
pass
@Models.register(is_backbone=True)
class ResNet101:
pass
@Models.register(is_backbone=False)
class head:
pass
print(Models.module_table(select_info='is_backbone'))
print(Models.module_table(filter='**Res**'))

Results:
╒═══════════╤═══════════════╕
│ Model │ is_backbone │
╞═══════════╪═══════════════╡
│ ResNet │ True │
├───────────┼───────────────┤
│ ResNet101 │ True │
├───────────┼───────────────┤
│ ResNet50 │ True │
├───────────┼───────────────┤
│ head │ False │
╘═══════════╧═══════════════╛
╒═══════════╕
│ Model │
╞═══════════╡
│ ResNet │
├───────────┤
│ ResNet101 │
├───────────┤
│ ResNet50 │
╘═══════════╛
Register all
from torch import optim
from excore import Registry
OPTIM = Registry("Optimizer")
def _get_modules(name: str, module) -> bool:
    if name[0].isupper():
        return True
    return False
OPTIM.match(optim, _get_modules)
print(OPTIM)

Results:
╒════════════╤════════════════════════════════════╕
│ NAME │ DIR │
╞════════════╪════════════════════════════════════╡
│ Adadelta │ torch.optim.adadelta.Adadelta │
├────────────┼────────────────────────────────────┤
│ Adagrad │ torch.optim.adagrad.Adagrad │
├────────────┼────────────────────────────────────┤
│ Adam │ torch.optim.adam.Adam │
├────────────┼────────────────────────────────────┤
│ AdamW │ torch.optim.adamw.AdamW │
├────────────┼────────────────────────────────────┤
│ SparseAdam │ torch.optim.sparse_adam.SparseAdam │
├────────────┼────────────────────────────────────┤
│ Adamax │ torch.optim.adamax.Adamax │
├────────────┼────────────────────────────────────┤
│ ASGD │ torch.optim.asgd.ASGD │
├────────────┼────────────────────────────────────┤
│ SGD │ torch.optim.sgd.SGD │
├────────────┼────────────────────────────────────┤
│ RAdam │ torch.optim.radam.RAdam │
├────────────┼────────────────────────────────────┤
│ Rprop │ torch.optim.rprop.Rprop │
├────────────┼────────────────────────────────────┤
│ RMSprop │ torch.optim.rmsprop.RMSprop │
├────────────┼────────────────────────────────────┤
│ Optimizer │ torch.optim.optimizer.Optimizer │
├────────────┼────────────────────────────────────┤
│ NAdam │ torch.optim.nadam.NAdam │
├────────────┼────────────────────────────────────┤
│ LBFGS │ torch.optim.lbfgs.LBFGS │
╘════════════╧════════════════════════════════════╛
All in one
Use Registry to find all registries, and merge them into a global one.
from excore import Registry
MODEL = Registry.get_registry("Model")
G = Registry.make_global()

✨ Register module
Registry is able to register not only classes and functions, but also python modules, for example:
from excore import Registry
import torch
MODULE = Registry("module")
MODULE.register_module(torch)

Then you can use torch in the config file:
[Model.ResNet]
$activation = "torch.nn.ReLU"
# or
!activation = "torch.nn.ReLU"equls to
import torch
from xxx import ResNet
ResNet(torch.nn.ReLU)
# or
ResNet(torch.nn.ReLU())

PathManager
Manage paths in a structured manner to create directories; if the scoped function fails, the created directories can be removed automatically.
from excore.plugins.path_manager import PathManager

with PathManager(
    base_path="./exp",
    sub_folders=["folder1", "folder2"],
    config_name="config_dir",
    instance_name="test1",
    remove_if_fail=True,
    sub_folder_exist_ok=False,
    config_name_first=False,
    return_str=True,
) as pm:
    folder1_path: str = pm.get("folder1")
    folder2_path: str = pm.get("folder2")
    do_sth(folder1_path, folder2_path)
    train()

The resulting structure will be:
exp
├── folder1
│ └── config_dir
│ └── test1
└── folder2
└── config_dir
└── test1
You can also use a dataclass for a better experience:
from dataclasses import dataclass

from excore.plugins.path_manager import PathManager

@dataclass
class SubPath:
    folder1: str = "folder1"
    folder2: str = "folder2"

sub_path = SubPath()

with PathManager(
    base_path="./exp",
    sub_folders=sub_path,
    config_name="config_dir",
    instance_name="test1",
    remove_if_fail=True,
    sub_folder_exist_ok=False,
    config_name_first=False,
    return_str=True,
) as pm:
    folder1_path: str = sub_path.folder1
    folder2_path: str = sub_path.folder2
    do_sth(folder1_path, folder2_path)
    train()

✨ Fine-Grained Config
Inspired by YOLO-style configuration, this enables fine-grained control over model architectures.
First, we need to attach some extra information to the registered classes that we want to configure in a fine-grained manner.
from torch import nn

from excore import Registry

MODEL = Registry("Model", extra_field=["receive", "send"])
MODEL.register_module(nn.Conv2d, receive="in_channels", send="out_channels")
MODEL.register_module(nn.BatchNorm2d, receive="num_features", send="num_features")

receive should be a str or a list of str containing the parameter names used in the passing procedure; the same goes for send.
Second, enable the fine-grained config:
from excore.plugins.finegrained_config import enable_finegrained_config
enable_finegrained_config()

Third, use * to define the finegrained_config. There are 3 required parameters:
- $class_mapping: list of class names to be used
- info: list of [repeat_times, module_index] for each layer group
- args: argument list for each layer's initialization
class_mapping = ['Conv2d', 'BatchNorm2d']
[Backbone.FinegrainedModel]
$backbone = "torch.nn.Sequential*FinegrainedConfig"
[FinegrainedConfig]
$class_mapping = "&class_mapping"
# [repeat_times, module idx]
info = [
[1, 0],
[3, 0],
[1, 1],
[2, 0],
[1, 1],
]
args = [
[3],
[32, 3],
[64, 3],
[128],
[224, 1],
[224],
]
$backbone = "torch.nn.Sequential*$ConfigInfo::backbone" where torch.nn.Sequential is the wrapper of the results of FinegrainedConfig. *FinegrainedConfig means apply the finegrained_config hook and get its initialize parameters from dict FinegrainedConfig.
The finegrained_config hook passes parameters between layers according to the receive and send names of each class.
Finally the backbone will be:
Sequential(
  (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
  (1): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (4): BatchNorm2d(64, eps=128, momentum=0.1, affine=True, track_running_stats=True)
  (5): Conv2d(64, 224, kernel_size=(1, 1), stride=(1, 1))
  (6): Conv2d(64, 224, kernel_size=(1, 1), stride=(1, 1))
  (7): BatchNorm2d(224, eps=224, momentum=0.1, affine=True, track_running_stats=True)
)

For more features, you may refer to the Roadmap of ExCore.



