Question: How to load weight from the local file? · Issue #11 · SapienzaNLP/relik · GitHub

Question: How to load weight from the local file? #11

Closed
PluseLin opened this issue Sep 5, 2024 · 2 comments

PluseLin commented Sep 5, 2024

In the README, ReLiK is initialized with the from_pretrained method, similar to HuggingFace transformers. I'd like to download the weights and load them from local files, just as I do with BERT. What should I download from HuggingFace, and how do I load it? Thanks.

Riccorl (Collaborator) commented Sep 12, 2024

The from_pretrained method in the various ReLiK components accepts either a HuggingFace Hub repository ID or a path to a local directory containing the model's weights and configuration. To load a complete ReLiK pipeline locally with Relik.from_pretrained, you need to download three components: the Index, the Retriever, and the Reader.

For example, if you want to load the sapienzanlp/relik-entity-linking-small pipeline locally, you'll first need to reference its original configuration file. Here's a snippet of the configuration for that pipeline:

_target_: relik.inference.annotator.Relik
retriever:
  span:
    _target_: relik.retriever.pytorch_modules.model.GoldenRetriever
    question_encoder: riccorl/retriever-relik-e5-small-entity-linking-aida-wikipedia-question-encoder
index:
  span:
    _target_: relik.retriever.indexers.inmemory.InMemoryDocumentIndex.from_pretrained
    name_or_path: riccorl/retriever-relik-e5-small-entity-linking-aida-wikipedia-index
reader:
  _target_: relik.reader.pytorch_modules.span.RelikReaderForSpanExtraction
  transformer_model: sapienzanlp/relik-reader-deberta-v3-base-aida
  use_nme: true
task: SPAN
metadata_fields: []
top_k: 100
window_size: 32
window_stride: 16

In this example, the configuration refers to models hosted on the HuggingFace Hub. If you'd like to use local files instead, you can simply replace the HuggingFace repository IDs with the paths to your local directories. Here's what that would look like:

_target_: relik.inference.annotator.Relik
retriever:
  span:
    _target_: relik.retriever.pytorch_modules.model.GoldenRetriever
    question_encoder: /your/local/drive/relik/retriever-relik-e5-small-entity-linking-aida-wikipedia-question-encoder
index:
  span:
    _target_: relik.retriever.indexers.inmemory.InMemoryDocumentIndex.from_pretrained
    name_or_path: /your/local/drive/relik/retriever-relik-e5-small-entity-linking-aida-wikipedia-index
reader:
  _target_: relik.reader.pytorch_modules.span.RelikReaderForSpanExtraction
  transformer_model: /your/local/drive/relik/relik-reader-deberta-v3-base-aida
  use_nme: true
task: SPAN
metadata_fields: []
top_k: 100
window_size: 32
window_stride: 16

To download each of these models, you can use the huggingface-cli tool. For example:

huggingface-cli download riccorl/retriever-relik-e5-small-entity-linking-aida-wikipedia-question-encoder --repo-type model --local-dir /your/local/path/relik/retriever-relik-e5-small-entity-linking-aida-wikipedia-question-encoder

Repeat this process for the other components (retriever, index, and reader) as needed. This approach allows you to load the entire ReLiK pipeline from local files rather than the HuggingFace Hub.
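The repeated downloads can be scripted. Here is a minimal sketch that loops over the three repo IDs from the config above; the base directory is a placeholder, and the echo makes it a dry run (remove it to actually download):

```shell
# Print the download command for each pipeline component (dry run: drop the
# `echo` to actually run the downloads). Repo IDs are the ones referenced in
# the configuration above; BASE is a placeholder path.
BASE=/your/local/drive/relik
for repo in \
  riccorl/retriever-relik-e5-small-entity-linking-aida-wikipedia-question-encoder \
  riccorl/retriever-relik-e5-small-entity-linking-aida-wikipedia-index \
  sapienzanlp/relik-reader-deberta-v3-base-aida
do
  # ${repo#*/} strips the user/org prefix, keeping just the model name
  echo huggingface-cli download "$repo" --repo-type model --local-dir "$BASE/${repo#*/}"
done
```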

from relik import Relik

# note that Relik.from_pretrained expects a folder containing the
# config.yaml file as input, not the file itself!
relik = Relik.from_pretrained("/your/local/drive/relik/relik-entity-linking-small-local")

Hope this helps!

PluseLin (Author) commented


I will try it. Thanks a lot.
