Use ABL-Package Step by Step
=============================

.. contents:: Table of Contents

Using ABL-Package for your learning tasks involves five steps (see the end-to-end sketch after this list):

- Build the reasoning part
- Build the machine learning part
- Build datasets and evaluation metrics
- Bridge the reasoning and machine learning parts
- Use ``Bridge.train`` and ``Bridge.test`` to train and test
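
Before walking through each step in detail, the following condensed sketch, assembled from the MNIST Add example used throughout this page, shows how the five steps fit together. The names used here (``AddKB``, ``LeNet5``, ``criterion``, ``optimizer``, and the various ABL-Package classes) are defined in the corresponding steps below.

.. code:: python

    # 1. Reasoning part: knowledge base + reasoner
    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

    # 2. Machine learning part (LeNet5, criterion and optimizer as defined below)
    cls = LeNet5(num_classes=len(kb.pseudo_label_list))
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))
    model = ABLModel(BasicNN(cls, criterion, optimizer))

    # 3. Data and evaluation metrics
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)
    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

    # 4. Bridge the two parts
    bridge = SimpleBridge(model, reasoner, metric_list)

    # 5. Train and test
    bridge.train(train_data, loops=5, segment_size=10000)
    bridge.test(test_data)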

Build the reasoning part
~~~~~~~~~~~~~~~~~~~~~~~~

In ABL-Package, the reasoning part is wrapped in the ``ReasonerBase`` class. To create an instance of this class, we first need to inherit from the ``KBBase`` class to customize our knowledge base. The ``__init__`` method of the knowledge base should take at least the argument ``pseudo_label_list``, which is a list of all possible pseudo labels. The ``logic_forward`` method of ``KBBase`` is abstract, and we need to implement it in our subclass to endow the knowledge base with the ability to perform deduction. In general, we can customize our knowledge base as follows:

.. code:: python

    class MyKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        def logic_forward(self, *args, **kwargs):
            # Deduction implementation...
            return deduction_result

Aside from the knowledge base, instantiating ``ReasonerBase`` also requires an extra argument, ``dist_func``, which is the consistency measure used to select the best candidate among all candidates. In general, we can instantiate our reasoner as follows:

.. code:: python

    kb = MyKB(pseudo_label_list)
    reasoner = ReasonerBase(kb, dist_func="hamming")

In the MNIST Add example, the reasoner looks like

.. code:: python

    class AddKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        # Implement the deduction function
        def logic_forward(self, nums):
            return sum(nums)

    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

Build the machine learning part
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Next, we build the machine learning part, which needs to be wrapped in the ``ABLModel`` class. We can use either a scikit-learn model or a PyTorch-based neural network to create an instance of ``ABLModel``.

- For a scikit-learn model, we can directly use it to create an instance of ``ABLModel``. For example, we can build our machine learning part as follows:

.. code:: python

    # Load a scikit-learn model
    base_model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

    model = ABLModel(base_model)

- For a PyTorch-based neural network, we first need to encapsulate it within a ``BasicNN`` object and then use this object to create an instance of ``ABLModel``. For example, we can build our machine learning part as follows:

.. code:: python

    # Load a PyTorch-based neural network
    cls = torchvision.models.resnet18(pretrained=True)

    # criterion and optimizer are used for training
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters())

    base_model = BasicNN(cls, criterion, optimizer)
    model = ABLModel(base_model)


In the MNIST Add example, the machine learning model looks like

.. code:: python

    cls = LeNet5(num_classes=len(kb.pseudo_label_list))
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))

    base_model = BasicNN(
        cls,
        criterion,
        optimizer,
        device=device,
        batch_size=32,
        num_epochs=1,
    )
    model = ABLModel(base_model)

Build datasets and evaluation metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Next, we need to build datasets and evaluation metrics for training and validation. ABL-Package assumes data to be in the form of ``(X, gt_pseudo_label, Y)``, where ``X`` is the input of the machine learning model, ``Y`` is the ground truth of the reasoning result, and ``gt_pseudo_label`` is the ground-truth label of each element in ``X``. ``X`` should be of type ``List[List[Any]]``, ``Y`` should be of type ``List[Any]``, and ``gt_pseudo_label`` can be ``None`` or of type ``List[List[Any]]``.
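
For instance, for the MNIST Add task, one example consists of a pair of digit images (an element of ``X``), their digit labels (an element of ``gt_pseudo_label``), and the sum of the two digits (an element of ``Y``). The following is only a minimal hand-built sketch of this data form; the random arrays stand in for actual digit images:

.. code:: python

    import numpy as np

    # A toy dataset in the (X, gt_pseudo_label, Y) form described above.
    # Random 28x28 arrays are placeholders for real digit images.
    img_3, img_5, img_7, img_2 = (np.random.rand(28, 28) for _ in range(4))

    X = [[img_3, img_5], [img_7, img_2]]   # List[List[Any]]: inputs to the ML model
    gt_pseudo_label = [[3, 5], [7, 2]]     # List[List[Any]] or None: labels of each element of X
    Y = [8, 9]                             # List[Any]: reasoning results (3+5, 7+2)

    train_data = (X, gt_pseudo_label, Y)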

In the MNIST Add example, the data loading looks like

.. code:: python

    # train_data and test_data are tuples consisting of X, gt_pseudo_label and Y.
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)

To validate and test the model, we need to inherit from ``BaseMetric`` to define our metrics and implement its ``process`` and ``compute_metrics`` methods. The ``process`` method accepts a batch of model outputs; after processing the batch, we save the extracted information to the ``self.results`` property. ``compute_metrics`` then receives all the information accumulated by ``process`` and uses it to calculate and return a dict holding the results of the evaluation metrics.
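
As an illustration, a custom metric might be sketched as follows. The exact signatures of ``process`` and ``compute_metrics`` and the structure of each batch element depend on the ABL-Package version, so the field names used here are hypothetical and this is only a sketch of the pattern described above:

.. code:: python

    class MyAccuracyMetric(BaseMetric):
        def process(self, data_samples):
            # Extract what we need from one batch of outputs and store it in
            # self.results; we assume each sample carries predicted and
            # ground-truth pseudo labels (hypothetical field names).
            for sample in data_samples:
                self.results.append(sample["pred_pseudo_label"] == sample["gt_pseudo_label"])

        def compute_metrics(self, results):
            # `results` is everything accumulated by process(); return a dict of metrics.
            return {"accuracy": sum(results) / len(results)}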

We provide two basic metrics, namely ``SymbolMetric`` and ``SemanticsMetric``, which are used to evaluate the accuracy of the machine learning model's predictions and the accuracy of the ``logic_forward`` results, respectively.

In the MNIST Add example, the metric definition looks like

.. code:: python

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

Bridge the reasoning and machine learning parts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We next need to bridge the reasoning and machine learning parts. In ABL-Package, the ``BaseBridge`` class defines the abstract interfaces needed to bridge the two parts, and ``SimpleBridge`` provides a basic implementation.
We build a bridge with the previously defined ``model``, ``reasoner``, and ``metric_list`` as follows:

.. code:: python

    bridge = SimpleBridge(model, reasoner, metric_list)

In the MNIST Add example, the bridge creation looks the same.

Use ``Bridge.train`` and ``Bridge.test`` to train and test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``BaseBridge.train`` and ``BaseBridge.test`` trigger the training and testing processes, respectively.

The two methods take the previously prepared ``train_data`` and ``test_data`` as input.

.. code:: python

    bridge.train(train_data)
    bridge.test(test_data)

Aside from data, ``BaseBridge.train`` can also take several other training configurations, shown as follows:

.. code:: python

    bridge.train(
        # training data
        train_data,
        # number of Abductive Learning loops
        loops=5,
        # data will be divided into segments and each segment will be used to train the model iteratively
        segment_size=10000,
        # evaluate the model every eval_interval loops
        eval_interval=1,
        # save the model every save_interval loops
        save_interval=1,
        # directory to save the model
        save_dir='./save_dir',
    )

In the MNIST Add example, the code to train and test looks like

.. code:: python

    bridge.train(train_data, loops=5, segment_size=10000, save_interval=1, save_dir=weights_dir)
    bridge.test(test_data)
