
I'm wondering if there is a way to train our own BERT tokenizer instead of using the pre-trained tokenizers provided by HuggingFace?


1 Answer


Yes, of course it is possible, but bear in mind that it implies training the whole model again, not only the tokenizer: the model's embedding matrix is indexed by the tokenizer's vocabulary, so the pre-trained weights are not usable with a new vocabulary.

Here you can find the steps to do everything with the HuggingFace libraries, including a complete example.
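As a rough illustration, here is a minimal sketch of the tokenizer-training step using the `transformers` method `train_new_from_iterator`, which learns a new vocabulary while reusing the original tokenizer's configuration. The corpus file `corpus.txt`, the batch size, and the vocabulary size are placeholder assumptions; adapt them to your data.

```python
from transformers import AutoTokenizer, BertConfig, BertForMaskedLM

# Start from the existing BERT tokenizer so the new one reuses its
# setup (WordPiece algorithm, special tokens, normalization rules).
old_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# "corpus.txt" is a placeholder: one training text per line.
def corpus_iterator(path="corpus.txt", batch_size=1000):
    with open(path, encoding="utf-8") as f:
        batch = []
        for line in f:
            batch.append(line.strip())
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch

# Learn a new WordPiece vocabulary from your own corpus.
new_tokenizer = old_tokenizer.train_new_from_iterator(
    corpus_iterator(), vocab_size=30522  # 30522 matches bert-base-uncased
)
new_tokenizer.save_pretrained("my-bert-tokenizer")

# The new vocabulary invalidates the pre-trained embeddings, so the
# model itself must be initialized from scratch and pre-trained again.
config = BertConfig(vocab_size=new_tokenizer.vocab_size)
model = BertForMaskedLM(config)  # randomly initialized weights
```

After this, you would run the usual masked-language-model pre-training on `model` with your corpus before fine-tuning it on any downstream task.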

noe