Arcmap
FiniteStateTransducers.arcmap
— Functionarcmap(f::Function, fst::WFST, args...; modify_initials=identity, modify_finals=identity, isym=get_isym(fst), osym=get_osym(fst))
Creates a new WFST from fst
by applying the function f
to all its arcs, see Arc
. The arguments of f
can be specified in args
.
The functions modify_initials
and modify_finals
operate in the initial and final dictionaries, which keys are the initial/final states and values are the corresponding state weights.
Input and output tables can also be modified using the keywords isym
and osym
.
The following example shows how to use arcmap
to convert the type of weight of a WFST:
julia> A = WFST(["a","b","c"]); # by default weight is TropicalWeight{Float32}
julia> add_arc!(A,1=>2,"a"=>"a",1);
julia> add_arc!(A,1=>3,"b"=>"c",3);
julia> initial!(A,1); final!(A,2,4); final!(A,3,2)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/0.0f0|
a:a/3.0f0 → (2)
b:c/3.0f0 → (3)
((2/4.0f0))
((3/2.0f0))
julia> trop2prob(x) = ProbabilityWeight{Float64}(exp(-get(x)))
trop2prob (generic function with 1 method)
julia> function trop2prob(arc::Arc)
ilab = get_ilabel(arc)
olab = get_olabel(arc)
w = trop2prob(get_weight(arc))
n = get_nextstate(arc)
return Arc(ilab,olab,w,n)
end
trop2prob (generic function with 2 methods)
julia> trop2prob(initials::Dict) = Dict(i => trop2prob(w) for (i,w) in initials)
trop2prob (generic function with 3 methods)
julia> arcmap(trop2prob,A; modify_initials=trop2prob, modify_finals=trop2prob)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/1.0|
a:a/0.3678794503211975 → (2)
b:c/0.049787066876888275 → (3)
((2/0.018315639346837997))
((3/0.1353352814912796))
Base.inv
— Functioninv(fst::WFST)
Inverts fst
such that input labels are swaped with output labels.
julia> A = linearfst(["a","b","c"],[1,2,3],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:1/1.0 → (2)
(2)
b:2/1.0 → (3)
(3)
c:3/1.0 → (4)
((4/1.0))
julia> inv(A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
1:a/1.0 → (2)
(2)
2:b/1.0 → (3)
(3)
3:c/1.0 → (4)
((4/1.0))
FiniteStateTransducers.proj
— Functionproj(get_iolabel::Function, fst::WFST)
Projects input labels to output labels (or viceversa). The function get_iolabel
should either be the function get_ilabel
or get_olabel
.
julia> A = linearfst(["a","b","c"],[1,2,3],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:1/1.0 → (2)
(2)
b:2/1.0 → (3)
(3)
c:3/1.0 → (4)
((4/1.0))
julia> proj(get_ilabel, A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:a/1.0 → (2)
(2)
b:b/1.0 → (3)
(3)
c:c/1.0 → (4)
((4/1.0))
julia> proj(get_olabel, A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
1:1/1.0 → (2)
(2)
2:2/1.0 → (3)
(3)
3:3/1.0 → (4)
((4/1.0))
Closure
FiniteStateTransducers.closure
— Functionclosure(fst; star=true)
Returns a new WFST cfst
that is the closure of fst
. If fst
transuces labels
to olabel
with weight w
, cfst
will be able to transduce repeat(ilabels,n)
to repeat(olabels,m)
with weight w^n
for any integer n
.
If star
is true
, cfst
will transduce the empty string to itself with weight one.
FiniteStateTransducers.closure!
— Functionclosure!(fst; star=true)
Same as closure
but modifies fst
in-place.
Composition
Base.:∘
— Function∘(A::WFST,B::WFST; filter=EpsSeq, connect=true)
Perform composition of the transucers A
and B
.
julia> A = linearfst(["a","b","c"],["α","β","γ"],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/0.0f0|
a:α/1.0f0 → (2)
(2)
b:β/1.0f0 → (3)
(3)
c:γ/1.0f0 → (4)
((4/0.0f0))
julia> B = matrix2wfst(["α","β","γ"],[:w,:x,:z],ones(3,3))
WFST #states: 4, #arcs: 9, #isym: 3, #osym: 3
|1/0.0f0|
α:w/1.0f0 → (2)
β:x/1.0f0 → (2)
γ:z/1.0f0 → (2)
(2)
α:w/1.0f0 → (3)
β:x/1.0f0 → (3)
γ:z/1.0f0 → (3)
(3)
α:w/1.0f0 → (4)
β:x/1.0f0 → (4)
γ:z/1.0f0 → (4)
((4/0.0f0))
julia> A∘B
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/0.0f0|
a:w/2.0f0 → (2)
(2)
b:x/2.0f0 → (3)
(3)
c:z/2.0f0 → (4)
((4/0.0f0))
The keyword filter
can specify the composition filter to be used, which makes it possible to handle epsilon-transitions. See Allauzen et al. "Filters for Efficient Composition of Weighted Finite-State Transducers".
If connect
is set to true
after completing the composition the connect
algorithm is applied.
FiniteStateTransducers.Trivial
— TypeTrivial
Simplest composition filter, can be used for epsilon-free WFSTs.
FiniteStateTransducers.EpsMatch
— TypeEpsMatch
Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths and giving priority to those that match epsilon labels.
julia> A = txt2fst("0 1 a a 3.0
1 2 b <eps> 77.0
2 3 c <eps> 33.0
3 4 d d 44.0
4 90.0
",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:a/3.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:d/44.0f0 → (5)
((5/90.0f0))
julia> B = txt2fst("0 1 a d 15.0
1 2 <eps> e 19.0
2 3 d a 12.0
3 55.0
",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 4, #arcs: 3, #isym: 5, #osym: 5
|1/0.0f0|
a:d/15.0f0 → (2)
(2)
ϵ:e/19.0f0 → (3)
(3)
d:a/12.0f0 → (4)
((4/55.0f0))
julia> C = ∘(A,B;filter=EpsMatch)
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:d/18.0f0 → (2)
(2)
b:e/96.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:a/56.0f0 → 5)
((5/145.0f0))
FiniteStateTransducers.EpsSeq
— TypeEpsSeq
Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths. Gives priority to epsilon paths consisting of epsilon-arcs in A
followed by epsilon-arcs in B
.
julia> A = txt2fst("0 1 a a 3.0
1 2 b <eps> 77.0
2 3 c <eps> 33.0
3 4 d d 44.0
4 90.0
",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:a/3.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:d/44.0f0 → (5)
((5/90.0f0))
julia> B = txt2fst("0 1 a d 15.0
1 2 <eps> e 19.0
2 3 d a 12.0
3 55.0
",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 4, #arcs: 3, #isym: 5, #osym: 5
|1/0.0f0|
a:d/15.0f0 → (2)
(2)
ϵ:e/19.0f0 → (3)
(3)
d:a/12.0f0 → (4)
((4/55.0f0))
julia> C = ∘(A,B;filter=EpsSeq)
WFST #states: 6, #arcs: 5, #isym: 5, #osym: 5
|1/0.0f0|
a:d/18.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
ϵ:e/19.0f0 → (5)
(5)
d:a/56.0f0 → (6)
((6/145.0f0))
Concatenation
Base.:*
— Function*(A::WFST,B::WFST)
Returns C
, the concatenation/product of A
and B
. If A
converts the sequence a_i
to a_o
with weight wa
and B
converts the sequence b_i
to b_o
with weight wb
then Cconverts the sequence
[ai,bi]to
[ao,bo]with weight
wa*wb`.
julia> sym = ["a","b","c","d","α","β","γ","δ"];
julia> A = WFST(sym);
julia> add_arc!(A,1=>2,"a"=>"α");
julia> add_arc!(A,2=>3,"b"=>"β");
julia> initial!(A,1); final!(A,3,5)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
((3/5.0f0))
julia> B = WFST(sym);
julia> add_arc!(B,1=>2,"c"=>"γ");
julia> add_arc!(B,2=>3,"d"=>"δ");
julia> initial!(B,1); final!(B,3,2)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
c:γ/0.0f0 → (2)
(2)
d:δ/0.0f0 → (3)
((3/2.0f0))
julia> C = A*B
WFST #states: 6, #arcs: 5, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
(3)
ϵ:ϵ/5.0f0 → (4)
(4)
c:γ/0.0f0 → (5)
(5)
d:δ/0.0f0 → (6)
((6/2.0f0))
julia> A(["a","b"])
(["α", "β"], 5.0f0)
julia> B(["c","d"])
(["γ", "δ"], 2.0f0)
julia> C(["a","b","c","d"])
(["α", "β", "γ", "δ"], 7.0f0)
Connect
FiniteStateTransducers.connect
— Functionconnect(fst::WFST, [i = get_initial(fst,single=true)])
Remove states that do not belong to paths that end in final states when starting from i
.
FiniteStateTransducers.connect!
— Functionconnect!(fst::WFST, [i = get_initial(fst,single=true)])
Same as connect
but modifies fst
in-place.
Determinize
FiniteStateTransducers.determinize_fsa
— Functiondeterminize_fsa(fsa)
Returns a deterministic finite state acceptor. fsa
must be an acceptor (is_acceptor
).
Requires the WFST to be defined in a left distributive and weakly divisible semiring.
See M. Mohri, F. Pereira, Fernando, M. Riley, Michael "Speech Recognition with Weighted Finite-State Transducers" in Springer Handb. Speech Process. 2008 for details.
Epsilon Removal
FiniteStateTransducers.rm_eps
— Functionrm_eps(fst::WFST)
Returns an equivalent WFST where arcs with input and output labels are removed.
FiniteStateTransducers.rm_eps!
— Functionrm_eps!(fst::WFST)
Same as rm_eps
but operates in place.
Iterators
FiniteStateTransducers.BFS
— TypeBFS(fst,initial)
Returns an iterator that performs a breadth-first search (BFS) of fst
starting from the state initial
.
At each iteartion the tuple (i,p)
is returned, where i
is the current state and p
a Path
.
FiniteStateTransducers.get_paths
— Functionget_paths(fst::FST, [i = get_initial(fst;single=true)])
Returns an iterator that generates all of the possible paths of fst
starting from i
.
Notice that the fst
must be acyclic for the algorithm to stop (see is_acyclic
).
FiniteStateTransducers.collectpaths
— Functioncollectpaths(fst::FST, [i = get_initial(fst;first=true)])
Returns an array containing all of the possible paths of fst
starting from i
.
Notice that the fst
must be acyclic for the algorithm to stop (see is_acyclic
).
FiniteStateTransducers.DFS
— TypeDFS(fst::WFST, initial; filter=arc->true)
Creates an iterator that performs a depth-first search (DFS) of the fst
starting from the state initial
. The function filter can be specify in order to perform the search ignoring arcs with user-defined properties.
For each iterator the tuple (p,s,n,d,e)
is returned, where:
p
is the parent parent state (p==0 if there is no parent)s
is the current staten
next state:n==0
ifs
has no arcs or if completely explored. These will give you standard DFS iterations.d
is aBool
: iftrue
means a new state is discovered, can be used to pick up DFS its (see example below)e
is aBool
, iftrue
means all arcs have been inspecteda
inspected arc (empty ifd == false
)
The following cases/colors are possible:
d == false && n ==0
-> black state, state has been checked completelyd == false && n !=0
->n
is already grayd == true -> n
state is colored gray, i.e. is visited for the first time
Example
julia> fst = WFST(Dict(1=>1));
julia> add_arc!(fst, 1, 2, 1, 1, 1.0);
julia> add_arc!(fst, 1, 5, 1, 1, 1.0);
julia> add_arc!(fst, 1, 3, 1, 1, 1.0);
julia> add_arc!(fst, 2, 6, 1, 1, 1.0);
julia> add_arc!(fst, 2, 4, 1, 1, 1.0);
julia> add_arc!(fst, 5, 4, 1, 1, 1.0);
julia> add_arc!(fst, 3, 4, 1, 1, 1.0);
julia> add_arc!(fst, 3, 7, 1, 1, 1.0);
julia> initial!(fst,1)
WFST #states: 7, #arcs: 8, #isym: 1, #osym: 1
|1/0.0f0|
1:1/1.0f0 → (2)
1:1/1.0f0 → (5)
1:1/1.0f0 → (3)
(2)
1:1/1.0f0 → (6)
1:1/1.0f0 → (4)
(3)
1:1/1.0f0 → (4)
1:1/1.0f0 → (7)
(4)
(5)
1:1/1.0f0 → (4)
(6)
(7)
julia> dfs = FiniteStateTransducers.DFS(fst,1); # DFS starting from state 1
julia> for (p,s,n,d,e,a) in dfs
println("$p,$s,$n")
if d == true
println("visiting first time $n (Gray)")
else
if e
println("completed state $s (Black)")
else
println("node $n already visited")
end
end
end
0,1,2
visiting first time 2 (Gray)
1,2,6
visiting first time 6 (Gray)
2,6,0
completed state 6 (Black)
1,2,4
visiting first time 4 (Gray)
2,4,0
completed state 4 (Black)
1,2,0
completed state 2 (Black)
0,1,5
visiting first time 5 (Gray)
1,5,4
node 4 already visited
1,5,0
completed state 5 (Black)
0,1,3
visiting first time 3 (Gray)
1,3,4
node 4 already visited
1,3,7
visiting first time 7 (Gray)
3,7,0
completed state 7 (Black)
1,3,0
completed state 3 (Black)
0,1,0
completed state 1 (Black)
FiniteStateTransducers.is_visited
— Functionis_visited(fst::WFST, i = get_initial(fst;single=true))
Return an array of booleans indicating if the i-th state of the fst is visted starting form i
.
Reverse
Base.reverse
— Functionreverse(fst::WFST)
Returns rfst
, the reversed version of fst
. If fst
transuces the sequence x
to y
with weight w
, rfst
transuces the reverse of x
to the reverse of y
with weight reverse(w)
.
Shortest Distance
FiniteStateTransducers.shortest_distance
— Functionshortest_distance(fst[, s=get_initial(fst); filter=arc->true, reverse=false)
Computes the shortest distance between the state s
to every other state. The shortest distance between s
and i
is defined as the sum of the weights of all paths between these states.
Weights are required to be rigth distributive and k-closed.
If fst
has N
states a N
-long vector is returned where the i
-the element contains the shortest distance between the states s
and i
.
If reversed==true
the shortest distance from the final states is computed. In this case s
has no effect since a new initial superstate is added to the reversed WFST. Here weights are required to be left distributive and k-closed.
See Mohri "Semiring framework and algorithms for shortest-distance problems", Journal of Automata, Languages and Combinatorics 7(3): 321-350, 2002.
Topological sort
FiniteStateTransducers.topsort
— Functiontopsort(fst::WFST, i = get_initial(fst); filter=arc->true)
Requires fst
to be acyclic.
Returns an equivalent WFST to fst
which is topologically sorted starting from the i
-th state.
Modify the filter
function to perform the operation considering only specific arcs.
FiniteStateTransducers.topsortperm
— Functiontopsortperm(fst::WFST, i = get_initial(fst); filter=arc->true)
Requires fst
to be acyclic.
Returns the topological permutation of states of fst
starting from the i
-th state.
Modify the filter
function to perform the operation considering only specific arcs.
FiniteStateTransducers.get_scc
— Functionget_scc(fst::WFST, i = get_initial(fst;single=true); filter=arc->true)
Calculates the strongly connected components of 'fst' using Tarjan's algorithm. Returns a tuple (scc,c,v)
:
scc
are the strongly connected components offst
c
is a boolean array containing the accessibility form thei
v
are the visited states of thefst
The function filter
can be used to consider only arcs with specific properties.
WFST label transition extraction
FiniteStateTransducers.wfst2tr
— Functionwfst2tr(fst,Nt; get_label=get_ilabel)
Returns an array time2tr
of length Nt-1
. The t
-th element of time2tr
is a dictionary mapping the transition index between input label at time t
to the input label at time t+1
.
julia> using FiniteStateTransducers
julia> H = WFST(["a1","a2","a3"],["a"]);
julia> add_arc!(H,1,2,"a1","a");
julia> add_arc!(H,2,3,"a2","<eps>");
julia> add_arc!(H,2,2,"a2","<eps>");
julia> add_arc!(H,3,4,"a3","<eps>");
julia> add_arc!(H,3,2,"a3","a");
julia> initial!(H,1);
julia> final!(H,4)
WFST #states: 4, #arcs: 5, #isym: 3, #osym: 1
|1/0.0f0|
a1:a/0.0f0 → (2)
(2)
a2:ϵ/0.0f0 → (3)
a2:ϵ/0.0f0 → (2)
(3)
a3:ϵ/0.0f0 → (4)
a3:a/0.0f0 → (2)
((4/0.0f0))
julia> Nt = 10;
julia> time2tr = wfst2tr(H,Nt)
9-element Array{Dict{Int64,Array{Int64,1}},1}:
Dict(1 => [2])
Dict(2 => [2, 3])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2],3 => [2])
Dict(2 => [3])
wfst2tr(fst; get_label=get_ilabel, convert_weight=identity)
Extract transition matrix and initial probabilities from a WFST representing an HMM model.
julia> add_arc!(H,1,2,"a1","a");
julia> add_arc!(H,2,2,"a1","<eps>",-log(0.3));
julia> add_arc!(H,2,3,"a2","<eps>",-log(0.7));
julia> add_arc!(H,3,3,"a2","<eps>",-log(0.3));
julia> add_arc!(H,3,4,"a3","<eps>",-log(0.7));
julia> add_arc!(H,4,4,"a3","<eps>",-log(0.3));
julia> add_arc!(H,4,2,"a1","a",-log(0.7));
julia> initial!(H,1);
julia> final!(H,4)
WFST #states: 4, #arcs: 12, #isym: 3, #osym: 1
|1/0.0f0|
a1:a/0.0f0 → (2)
a1:a/0.0f0 → (2)
(2)
a2:ϵ/0.0f0 → (3)
a2:ϵ/0.0f0 → (2)
a1:ϵ/1.2039728f0 → (2)
a2:ϵ/0.35667494f0 → (3)
(3)
a3:ϵ/0.0f0 → (4)
a3:a/0.0f0 → (2)
a2:ϵ/1.2039728f0 → (3)
a3:ϵ/0.35667494f0 → (4)
((4/0.0f0))
a3:ϵ/1.2039728f0 → (4)
a1:a/0.35667494f0 → (2)
julia> a,A = wfst2tr(H; convert_weight = w -> exp(-get(w)))
(Float32[1.0, 0.0, 0.0], Float32[0.3 0.7 0.0; 0.0 0.3 0.7; 0.7 0.0 0.3])