WFSTs internals
Formal definition
Formally a WFSTs over the semiring $\mathbb{W}$ is the tuple $(\mathcal{A},\mathcal{B},Q,I,F,E,\lambda,\rho)$ where:
- $\mathcal{A}$ is the input alphabet (set of input labels)
 - $\mathcal{B}$ is the output alphabet (set of output labels)
 - $Q$ is a set of states (usually integers)
 - $I \subseteq Q$ is the set of initial states
 - $F \subseteq Q$ is the set of final states
 - $E \subseteq Q \times \mathcal{A} \cup \{ \epsilon \} \times \mathcal{B} \cup \{ \epsilon \} \times \mathbb{W} \times Q$ is a set of arcs (transitions) where an element consist of the tuple (starting state,input label, output label, weigth, destination state)
 - $\lambda : I \rightarrow \mathbb{W}$ a function that maps initial states to a weight
 - $\rho : F \rightarrow \mathbb{W}$ a function that maps final states to a weight
 
Constructors and modifiers
FiniteStateTransducers.WFST — TypeWFST(isym, [osym]; W = TropicalWeight{Float32})
Construct an empty Weighted Finite State Transducer (WFST)
isyminput symbol table (can be a dictionary or a vector)osymoutput symbol dictionary (optional)Wweight type
If the symbol table are AbstractDict{I,D}, D must be an integer and I the type of the symbol.
FiniteStateTransducers.add_arc! — Functionadd_arc!(fst, src, dest, ilabel, olabel[, w=one(W)])
add_arc!(fst, srcdest::Pair, ilabelolabel::Pair[, w=one(W)])
Adds an arc from state src to state dest with input label ilabel, output label olabel and weight w. Alternative notation utilizes Pairs.
If w is not provided this defaults to one(W) where W is the weight type of fst.
FiniteStateTransducers.add_states! — Functionadd_states!(fst,N)
Add N empty states to the FST.
FiniteStateTransducers.initial! — Functioninitial!(fst::WFST{W},i, [w=one(W)])
Set the i-th state as a initial state. A weight w can also be provided.
FiniteStateTransducers.final! — Functionfinal!(fst::WFST{W},i, [w=one(W)])
Set the i-th state as a final state. A weight w can also be provided.
FiniteStateTransducers.rm_final! — Functionrm_final!(fst,i)
Remove final state labeling of the i-th state.
FiniteStateTransducers.rm_initial! — Functionrm_initial!(fst,i)
Remove initial state labeling of the state i.
Internal access
FiniteStateTransducers.get_isym — Functionget_isym(fst::WFST)
Returns the input symbol table of fst. The table consists of a dictionary where each element goes from the symbol to the corresponding index.
FiniteStateTransducers.get_iisym — Functionget_iisym(fst::WFST)
Returns the inverted input symbol table.
FiniteStateTransducers.get_osym — Functionget_osym(fst::WFST)
Returns the output symbol table of fst. The table consists of a dictionary where each element goes from the symbol to the corresponding index. 
FiniteStateTransducers.get_iosym — Functionget_iosym(fst::WFST)
Returns the inverted output symbol table.
FiniteStateTransducers.get_states — Functionget_states(fst::WFST)
Returns the states the fst.
FiniteStateTransducers.get_initial — Functionget_initial(fst::WFST, [single=false])
Returns the initial state id of the fst. If single is true the first initial state is returned.
FiniteStateTransducers.get_initial_weight — Functionget_initial_weight(fst::WFST,i)
Returns the weight of initial i-th state of fst. 
FiniteStateTransducers.get_final — Functionget_final(fst::WFST, [single=false])
Returns the final state id of the fst. If single is true the first final state is returned.
FiniteStateTransducers.get_final_weight — Functionget_final_weight(fst::WFST,i)
Returns the weight of i-th final state of fst. 
FiniteStateTransducers.get_ialphabet — Functionget_ialphabet(label::fst)
Returns the alphabet of the input labels of fst. The alphabet is defined as the set of symbols composing the input or output labels.
FiniteStateTransducers.get_oalphabet — Functionget_oalphabet(label::fst)
Returns the alphabet of the output labels of fst. The alphabet is defined as the set of symbols composing the input or output labels.
FiniteStateTransducers.get_arcs — Functionget_arcs(fst::WFST) = ArcIterator(fst)
Returns an iterator that can be used to loop through all of the arcs of fst.
Arcs
FiniteStateTransducers.Arc — TypeArc(ilab::Int,olab::Int,w::W,n::Int)
Constructs an arc with input label ilab, output label olab, weight 'weight' and nextstate n. ilab and olab must be integers consistent with a input/output symbol table of the WFST at use.
Mainly used for internals see add_arc! for a simpler way of adding arcs to a WFST.   
FiniteStateTransducers.get_ilabel — Functionget_ilabel(A::Arc)
Returns the input label index of the arc A.
FiniteStateTransducers.get_olabel — Functionget_olabel(A::Arc)
Returns the input label index of the arc A.
FiniteStateTransducers.get_weight — Functionget_weight(A::Arc)
Returns the weight the arc A.
get_weight(fst::WFST,i)
Returns the weight of the i-th state of fst. If there is no weight nothing is returned.
FiniteStateTransducers.get_nextstate — Functionget_nextstate(A::Arc)
Returns the state that the arc A is pointing to.
Paths
FiniteStateTransducers.Path — TypePath(isym[,osym], iolabel::Pair, w=one(TropicalWeight{Float32}))
Construct a path with input and output symbol table isym and (optional) osym (see WFST).
iolabel is a Pair of vectors.
julia> isym = ["a","b","c","d"];
julia> osym = ["α","β","γ","δ"];
julia> W = ProbabilityWeight{Float32};
julia> p = Path(isym,osym,["a","b","c"] => ["α","β","γ"], one(W))
["a", "b", "c"] → ["α", "β", "γ"], w:1.0f0
The weight of a path of a WFST results from the multiplication ($\otimes$) of the weights of the different arcs that are transversed. See e.g. get_paths to extract paths from a WFST.
julia> p * Path(isym,osym,["d"] => ["δ"], W(0.5))
["a", "b", "c", "d"] → ["α", "β", "γ", "δ"], w:0.5f0
FiniteStateTransducers.get_isequence — Functionget_isequence(p::Path)
Returns the input sequence of the path p.
FiniteStateTransducers.get_osequence — Functionget_osequence(p::Path)
Returns the output sequence of the path p.
Tour of the internals
Let's build a simple WFST and check out its internals:
julia> using FiniteStateTransducers
julia> A = WFST(["hello","world"],[:ciao,:mondo]);
julia> add_arc!(A,1=>2,"hello"=>:ciao);
julia> add_arc!(A,1=>3,"world"=>:mondo);
julia> add_arc!(A,2=>3,"world"=>:mondo);
julia> initial!(A,1);
julia> final!(A,3)
WFST #states: 3, #arcs: 3, #isym: 2, #osym: 2
|1/0.0f0|
hello:ciao/0.0f0 → (2)
world:mondo/0.0f0 → (3)
(2)
world:mondo/0.0f0 → (3)
((3/0.0f0))
For this simple WFST the states consists of an Array of Arrays containing Arc:
julia> get_states(A)
3-element Array{Array{FiniteStateTransducers.Arc{TropicalWeight{Float32},Int64},1},1}:
 [1:1/0.0f0 → (2), 2:2/0.0f0 → (3)]
 [2:2/0.0f0 → (3)]
 []As it can be seen the first state has two arcs, second state only one and the final state none. A state can also be accessed as follows:
julia> A[2]
1-element Array{FiniteStateTransducers.Arc{TropicalWeight{Float32},Int64},1}:
 2:2/0.0f0 → (3)Here the arc's input/output labels are displayed as integers. We would expect world:mondo/0.0f0 → (3) instead of 2:2/0.0f0 → (3).  This is due to fact that, contrary to the formal definition, labels are not stored directly in the arcs but an index is used instead, which corresponds to the input/output symbol table:
julia> get_isym(A)
Dict{String,Int64} with 2 entries:
  "hello" => 1
  "world" => 2
julia> get_osym(A)
Dict{Symbol,Int64} with 2 entries:
  :mondo => 2
  :ciao  => 1Another difference is in the definition of $I$, $F$, $\lambda$ and $\rho$. These are also represented by dictionaries that can be accessed using the functions get_initial and get_final.
julia> get_final(A)
Dict{Int64,TropicalWeight{Float32}} with 1 entry:
  3 => 0.0
Epsilon label
FiniteStateTransducers.iseps — Functioniseps(x)
Checks if x is the epsilon label. Defined only for String and Int, where <eps> and 0 are the epsilon symbols. Extend this method if you want to create WFST with a user defined type.
FiniteStateTransducers.get_eps — Functionget_eps(x::Type)
Returns the epsilon label for a given Type. Defined only for String and Int, where <eps> and 0 are the epsilon symbols. Extend this method if you want to create WFST with a user defined type.
By default 0 indicates the epsilon label which is not present in the symbol table.
Currently the following epsilon symbols are reserved for the following types:
| Type | Label | 
|---|---|
Char | 'ϵ' | 
String | "<eps>" | 
Int | 0 | 
We can define a particular type by extending the functions iseps and get_eps. For example if we want to introduce an epsilon symbol for the type Symbol:
julia> FiniteStateTransducers.iseps(x::Symbol) = x == :eps;
julia> FiniteStateTransducers.get_eps(x::Type{Symbol}) = :eps;
julia> add_arc!(A,3=>3,"world"=>:eps)
WFST #states: 3, #arcs: 5, #isym: 2, #osym: 2
|1/0.0f0|
hello:ciao/0.0f0 → (2)
world:mondo/0.0f0 → (3)
(2)
world:mondo/0.0f0 → (3)
((3/0.0f0))
world:ϵ/0.0f0 → (3)
julia> A[3]
1-element Array{Arc{TropicalWeight{Float32},Int64},1}:
 2:0/0.0f0 → (3)Properties
Base.length — Functionlength(fst::WFST)
Returns the number of states of the fst.
Base.size — Functionsize(fst::WFST,[i])
Returns number of states and total number of arcs in a tuple. If 'i=1' returns the number of states and if i=2 the number of arcs. 
FiniteStateTransducers.isinitial — Functionisinitial(fst::WFST, i)
Check if the i-th state is an initial state.
FiniteStateTransducers.isfinal — Functionisfinal(fst::WFST, i)
Check if the i-th state is an final state.
FiniteStateTransducers.typeofweight — Functiontypeofweight(fst_or_path)
Returns the weight type of a WFST or a path.
FiniteStateTransducers.has_eps — Functionhas_eps([label::Function,] fst::FST)
Check if fst has epsilon transition in either input or output labels. To specify input or output plug in either get_ilabel or get_olabel respectively in label. 
FiniteStateTransducers.count_eps — Functioncount_eps(label::Function, fst::FST)
Returns number of epsilon transition of either input or output labels. To specify input or output plug in ilabel and olabel respectively in label. 
FiniteStateTransducers.is_acceptor — Functionis_acceptor(fst::WFST)
Returns true if fst is an acceptor, i.e. if all arcs have equal input and output labels.
FiniteStateTransducers.is_acyclic — Functionis_acyclic(fst::WFST[,i = get_initial(fst;single=true)]; kwargs...)
Returns the number of states of the fst. For kwargs see DFS.
FiniteStateTransducers.is_deterministic — Functionis_deterministic(fst::WFST; get_label=get_ilabel)
Returns true if fst is deterministic in the input labels. A input label deterministic WFST must have a single initial state and arcs leaving any state do not share the same input label.
Change the keyword get_label to get_olabel in order to check determinism in the output labels.
Other constructors
FiniteStateTransducers.linearfst — Functionlinearfst(ilabels,olabels,w,isym,[osym])
Creates a linear WFST. ilabels, olabels and w are the input labels, output labels and weights respectively which must be vectors of the same size.  Keyword arguments are the same of WFST constructor.
FiniteStateTransducers.matrix2wfst — Functionmatrix2wfst(isym,[osym,] X; W = TropicalWeight{Float32})
Creates a WFST with weight type W and input output tables isym and osym using the matrix X. If X is a matrix of dimensions NsxNt, the resulting fst will have Nt+1 states. Arcs form the t-th state of fst will go only to the t+1-th state and will correspond to the non-zero element of the t-th column of X.  State 1 and Nt+1 are initial and final state, respectively.