WordPieceTokenizer
ContinuingSubwordPrefix
MaxInputCharsPerWord
Normalizer
PreTokenizer
SpecialTokens
UnknownToken
UnknownTokenId
CountTokens(String, ReadOnlySpan<Char>, EncodeSettings)
Create(Stream, WordPieceOptions)
Create(String, WordPieceOptions)
CreateAsync(Stream, WordPieceOptions, CancellationToken)
CreateAsync(String, WordPieceOptions, CancellationToken)
Decode(IEnumerable<Int32>)
Decode(IEnumerable<Int32>, Boolean)
Decode(IEnumerable<Int32>, Span<Char>, Int32, Int32)
Decode(IEnumerable<Int32>, Span<Char>, Boolean, Int32, Int32)
EncodeToIds(String, ReadOnlySpan<Char>, EncodeSettings)
EncodeToTokens(String, ReadOnlySpan<Char>, EncodeSettings)
GetIndexByTokenCount(String, ReadOnlySpan<Char>, EncodeSettings, Boolean, String, Int32)
netcoreapp2.1
namespace Microsoft.ML.Tokenizers
{
    public class WordPieceTokenizer : Tokenizer
    {
        protected override int CountTokens(string? text, ReadOnlySpan<char> textSpan, EncodeSettings settings);
    }
}
.NET | 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 |
---|---|
.NET Core | 2.0, 2.1, 2.2, 3.0, 3.1 |
.NET Framework | 4.6.1, 4.6.2, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1 |
.NET Standard | 2.0, 2.1 |
Information specific to netcoreapp2.1 | |
Assemblies | Microsoft.ML.Tokenizers, Version=1.0.0.0, PublicKeyToken=cc7b13ffcd2ddd51 |
Referencing | Your project needs a package reference to |
Package | Microsoft.ML.Tokenizers (1.0.2) netstandard2.0 |
Platform Restrictions | This framework does not have platform annotations. |