==========================================
Learnwares Reuse
==========================================

``Learnware Reuser`` is a ``Python API`` that offers a variety of convenient tools for learnware reuse. With these tools, users can efficiently reuse a single learnware, a combination of multiple learnwares,
or heterogeneous learnwares, saving the time and effort of building models from scratch. There are two main types of
reuse tools, depending on whether the user has gathered a small amount of labeled data beforehand: (1) direct reuse and (2) customized reuse based on labeled data.

.. note::

    For detailed explanations of the learnware reusers mentioned below, please refer to `COMPONENTS: All Reuse Methods <../components/learnware.html#all-reuse-methods>`_.

Homo Reuse
====================

This part introduces baseline methods for reusing homogeneous learnwares to make predictions on unlabeled data.

Direct Reuse of Learnware
--------------------------

- ``JobSelectorReuser`` selects different learnwares for different data by training a ``job selector`` classifier. The following code shows how to use it:

.. code:: python

    from learnware.reuse import JobSelectorReuser

    # learnware_list is the list of searched learnwares
    reuse_job_selector = JobSelectorReuser(learnware_list=learnware_list)

    # test_x is the user's data for prediction
    # predict_y is the prediction result of the reused learnwares
    predict_y = reuse_job_selector.predict(user_data=test_x)

- ``AveragingReuser`` uses an ensemble method to make predictions. The ``mode`` parameter specifies the ensemble method:

.. code:: python

    from learnware.reuse import AveragingReuser

    # Regression tasks:
    #   - mode="mean": average the learnware outputs.
    # Classification tasks:
    #   - mode="vote_by_label": majority vote for learnware output labels.
    #   - mode="vote_by_prob": majority vote for learnware output label probabilities.
    reuse_ensemble = AveragingReuser(
        learnware_list=learnware_list, mode="vote_by_label"
    )
    ensemble_predict_y = reuse_ensemble.predict(user_data=test_x)

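To make the three ``mode`` values concrete, here is a minimal NumPy sketch of the ensemble rules they name. It is not the ``AveragingReuser`` implementation: the helper names ``mean_ensemble``, ``vote_by_label`` and ``vote_by_prob`` are hypothetical, and ``vote_by_prob`` is read here as averaging the class probabilities and taking the arg-max, which is an assumption about the intended behavior.

```python
import numpy as np


def mean_ensemble(outputs):
    """Average regression outputs; `outputs` has shape (n_models, n_samples)."""
    return np.mean(np.asarray(outputs, dtype=float), axis=0)


def vote_by_label(labels, n_classes):
    """Majority vote over integer labels of shape (n_models, n_samples)."""
    labels = np.asarray(labels)
    # counts[c, i] = number of models that predicted class c for sample i
    counts = np.stack([(labels == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)  # ties resolve to the smallest class index


def vote_by_prob(probs):
    """Average probabilities of shape (n_models, n_samples, n_classes), then arg-max."""
    return np.mean(np.asarray(probs, dtype=float), axis=0).argmax(axis=1)


# Three toy "learnwares" predicting on two samples (made-up numbers).
reg_out = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
label_out = [[0, 1], [0, 2], [1, 2]]
prob_out = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.2, 0.8], [0.5, 0.5]],
]

mean_pred = mean_ensemble(reg_out)                  # regression: mode="mean"
label_pred = vote_by_label(label_out, n_classes=3)  # classification: mode="vote_by_label"
prob_pred = vote_by_prob(prob_out)                  # classification: mode="vote_by_prob"
```

Label voting only needs hard predictions from each model, while probability-based voting lets a confident model outweigh several uncertain ones.
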
Reusing Learnware with Labeled Data
------------------------------------

When users have a small amount of labeled data, they can adapt the received learnware(s) to their task by reusing them together with that data, which can further improve performance.

- ``EnsemblePruningReuser`` selectively ensembles a subset of the learnwares, keeping the ones that are most suitable for the user's task:

.. code:: python

    from learnware.reuse import EnsemblePruningReuser

    # mode="regression": Suitable for regression tasks
    # mode="classification": Suitable for classification tasks
    reuse_ensemble_pruning = EnsemblePruningReuser(
        learnware_list=learnware_list, mode="regression"
    )

    # (val_X, val_y) is the small amount of labeled data
    reuse_ensemble_pruning.fit(val_X, val_y)
    predict_y = reuse_ensemble_pruning.predict(user_data=test_x)

- ``FeatureAugmentReuser`` helps users reuse learnwares by augmenting features. This reuser treats each received learnware as a feature augmentor: it takes the learnware's output as a new feature and then builds a simple model on the augmented feature set (``logistic regression`` for classification tasks and ``ridge regression`` for regression tasks):

.. code:: python

    from learnware.reuse import FeatureAugmentReuser

    # mode="regression": Suitable for regression tasks
    # mode="classification": Suitable for classification tasks
    reuse_feature_augment = FeatureAugmentReuser(
        learnware_list=learnware_list, mode="regression"
    )

    # (val_X, val_y) is the small amount of labeled data
    reuse_feature_augment.fit(val_X, val_y)
    predict_y = reuse_feature_augment.predict(user_data=test_x)

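The feature-augmentation idea described above can be sketched in a few lines of NumPy for the regression case. This is an illustration under stated assumptions, not the ``FeatureAugmentReuser`` source: the class ``ToyFeatureAugmentRegressor``, its ``alpha`` parameter, and the toy learnware are all hypothetical, and the ridge solution is computed in closed form.

```python
import numpy as np


class ToyFeatureAugmentRegressor:
    """Hypothetical stand-in: append each model's output as a feature, then fit ridge."""

    def __init__(self, models, alpha=1.0):
        self.models = models  # callables mapping (n, d) arrays to (n,) predictions
        self.alpha = alpha    # ridge regularization strength
        self.coef_ = None

    def _augment(self, x):
        x = np.asarray(x, dtype=float)
        extra = [np.asarray(m(x), dtype=float).reshape(len(x), -1) for m in self.models]
        ones = np.ones((len(x), 1))  # bias column
        return np.hstack([x] + extra + [ones])

    def fit(self, val_x, val_y):
        z = self._augment(val_x)
        penalty = self.alpha * np.eye(z.shape[1])
        penalty[-1, -1] = 0.0  # leave the bias term unregularized
        y = np.asarray(val_y, dtype=float)
        # Closed-form ridge solution: (Z^T Z + penalty)^{-1} Z^T y
        self.coef_ = np.linalg.solve(z.T @ z + penalty, z.T @ y)
        return self

    def predict(self, user_data):
        return self._augment(user_data) @ self.coef_


def toy_learnware(x):
    # Pretend this model already captures the nonlinear part of the task.
    return np.sin(3.0 * x[:, 0])


def target(x):
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1]


rng = np.random.default_rng(0)
val_x = rng.uniform(-1.0, 1.0, size=(40, 3))    # small labeled set
test_x = rng.uniform(-1.0, 1.0, size=(200, 3))  # unlabeled user data

reuser = ToyFeatureAugmentRegressor([toy_learnware], alpha=1e-3)
reuser.fit(val_x, target(val_x))
mse = float(np.mean((reuser.predict(test_x) - target(test_x)) ** 2))
```

In this toy setup the learnware's output supplies the nonlinear term that a linear model on the raw features could not represent, so a simple ridge model on the augmented features fits the target closely.
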
Hetero Reuse
====================

When heterogeneous learnware search is activated (see `WORKFLOWS: Hetero Search <../workflows/search.html#hetero-search>`_), users receive heterogeneous learnwares identified from the whole "specification world".
Although these recommended learnwares were trained on tasks whose feature/label spaces differ from the user's task, they can still be helpful and perform well beyond their original purpose.

Normally such learnwares are hard to use, let alone polish, because of the feature/label space heterogeneity. However, with the help of the ``HeteroMapAlignLearnware`` class, which aligns a heterogeneous learnware
with the user's task, users can easily reuse them with the same set of reuse methods mentioned above.

During the alignment of a heterogeneous learnware, the statistical specifications of the learnware and of the user's task (``user_spec``) are used for input space alignment,
and a small amount of labeled data (``val_x``, ``val_y``) is required for output space alignment. This can be done with the following code:

.. code:: python

    from learnware.reuse import HeteroMapAlignLearnware

    # mode="regression": For user tasks of regression
    # mode="classification": For user tasks of classification
    hetero_learnware = HeteroMapAlignLearnware(learnware=learnware, mode="regression")
    hetero_learnware.align(user_spec, val_x, val_y)

    # Make predictions using the aligned heterogeneous learnware
    predict_y = hetero_learnware.predict(user_data=test_x)

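To illustrate why labeled data is needed for output space alignment, the sketch below wraps a model whose outputs live in a different label space and fits a linear rescaling on a few labeled samples. It is only a toy under stated assumptions: ``ToyOutputAlignedModel`` is a hypothetical class, the linear output map is an assumption rather than the method used by ``HeteroMapAlignLearnware``, and input space alignment from statistical specifications is not covered.

```python
import numpy as np


class ToyOutputAlignedModel:
    """Hypothetical wrapper: rescale a model's outputs to the user's label space."""

    def __init__(self, model):
        self.model = model  # callable mapping (n, d) arrays to (n,) outputs
        self.scale_ = 1.0
        self.offset_ = 0.0

    def _raw(self, x):
        return np.asarray(self.model(np.asarray(x, dtype=float)), dtype=float)

    def align(self, val_x, val_y):
        raw = self._raw(val_x)
        design = np.column_stack([raw, np.ones_like(raw)])
        # Least-squares fit of val_y ~ scale * raw + offset
        coef, *_ = np.linalg.lstsq(design, np.asarray(val_y, dtype=float), rcond=None)
        self.scale_, self.offset_ = coef
        return self

    def predict(self, user_data):
        return self.scale_ * self._raw(user_data) + self.offset_


weights = np.array([2.0, -1.0, 0.5])


def celsius(x):
    # The user's label space (toy ground truth).
    return x @ weights


def fahrenheit_model(x):
    # A toy model trained for the same quantity but in a different label space.
    return 1.8 * (x @ weights) + 32.0


rng = np.random.default_rng(0)
val_x = rng.normal(size=(10, 3))   # small labeled set
test_x = rng.normal(size=(50, 3))  # unlabeled user data

aligned = ToyOutputAlignedModel(fahrenheit_model).align(val_x, celsius(val_x))
aligned_pred = aligned.predict(test_x)
```

Because the wrapper exposes the same ``predict(user_data)`` call as the original model, an aligned model of this kind can be dropped into any code that expects a homogeneous one.
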
To reuse multiple heterogeneous learnwares,
combine ``HeteroMapAlignLearnware`` with the homogeneous reuse methods ``AveragingReuser`` and ``EnsemblePruningReuser`` mentioned above:

.. code:: python

    from learnware.reuse import (
        AveragingReuser,
        EnsemblePruningReuser,
        HeteroMapAlignLearnware,
    )

    # Align every heterogeneous learnware with the user's task first
    hetero_learnware_list = []
    for learnware in learnware_list:
        hetero_learnware = HeteroMapAlignLearnware(learnware, mode="regression")
        hetero_learnware.align(user_spec, val_x, val_y)
        hetero_learnware_list.append(hetero_learnware)

    # Reuse multiple heterogeneous learnwares using AveragingReuser
    reuse_ensemble = AveragingReuser(learnware_list=hetero_learnware_list, mode="mean")
    ensemble_predict_y = reuse_ensemble.predict(user_data=test_x)

    # Reuse multiple heterogeneous learnwares using EnsemblePruningReuser
    reuse_ensemble = EnsemblePruningReuser(
        learnware_list=hetero_learnware_list, mode="regression"
    )
    reuse_ensemble.fit(val_x, val_y)
    ensemble_pruning_predict_y = reuse_ensemble.predict(user_data=test_x)

Reuse with ``Model Container``
================================

The ``Learnware`` package provides ``Model Container`` to build an execution environment for each learnware according to its runtime dependency files. The learnware's model is executed inside the container, and its environment is installed and uninstalled automatically.

Run the following code to try running a learnware with ``Model Container``:

.. code-block:: python

    import numpy as np

    from learnware.learnware import Learnware
    # Assumed import path for LearnwaresContainer; adjust it if your installed version differs
    from learnware.client.container import LearnwaresContainer

    # learnware is an instance of the Learnware class, and its input shape is (20, 204)
    with LearnwaresContainer(learnware, mode="conda") as env_container:
        learnware = env_container.get_learnwares_with_container()[0]
        input_array = np.random.random(size=(20, 204))
        print(learnware.predict(input_array))

The ``mode`` parameter has two options, each corresponding to a specific way of loading the learnware environment:

- ``'conda'``: Install a separate conda virtual environment for each learnware (automatically deleted after execution); run each learnware independently within its virtual environment.
- ``'docker'``: Install a conda virtual environment inside a Docker container (automatically destroyed after execution); run each learnware independently within the container (requires Docker privileges).

.. note::

    The ``'conda'`` mode is not secure if any of the learnwares is malicious. If the user cannot guarantee the security of the learnware they want to load, it is recommended to use the ``'docker'`` mode to load it.