Arcmap
FiniteStateTransducers.arcmap
— Functionarcmap(f::Function, fst::WFST, args...; modify_initials=identity, modify_finals=identity, isym=get_isym(fst), osym=get_osym(fst))
Creates a new WFST from fst
by applying the function f
to all its arcs, see Arc
. The arguments of f
can be specified in args
.
The functions modify_initials
and modify_finals
operate in the initial and final dictionaries, which keys are the initial/final states and values are the corresponding state weights.
Input and output tables can also be modified using the keywords isym
and osym
.
The following example shows how to use arcmap
to convert the type of weight of a WFST:
julia> A = WFST(["a","b","c"]); # by default weight is TropicalWeight{Float32}
julia> add_arc!(A,1=>2,"a"=>"a",1);
julia> add_arc!(A,1=>3,"b"=>"c",3);
julia> initial!(A,1); final!(A,2,4); final!(A,3,2)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/0.0f0|
a:a/3.0f0 → (2)
b:c/3.0f0 → (3)
((2/4.0f0))
((3/2.0f0))
julia> trop2prob(x) = ProbabilityWeight{Float64}(exp(-get(x)))
trop2prob (generic function with 1 method)
julia> function trop2prob(arc::Arc)
ilab = get_ilabel(arc)
olab = get_olabel(arc)
w = trop2prob(get_weight(arc))
n = get_nextstate(arc)
return Arc(ilab,olab,w,n)
end
trop2prob (generic function with 2 methods)
julia> trop2prob(initials::Dict) = Dict(i => trop2prob(w) for (i,w) in initials)
trop2prob (generic function with 3 methods)
julia> arcmap(trop2prob,A; modify_initials=trop2prob, modify_finals=trop2prob)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/1.0|
a:a/0.3678794503211975 → (2)
b:c/0.049787066876888275 → (3)
((2/0.018315639346837997))
((3/0.1353352814912796))
Base.inv
— Functioninv(fst::WFST)
Inverts fst
such that input labels are swaped with output labels.
FiniteStateTransducers.proj
— Functionproj(iolabel::Function, fst::WFST)
Projects input labels to output labels (or viceversa). The function get_iolabel
should either be the function get_ilabel
or get_olabel
.
Closure
FiniteStateTransducers.closure
— Functionclosure(fst; star=true)
Returns a new WFST cfst
that is the closure of fst
. If fst
transuces labels
to olabel
with weight w
, cfst
will be able to transduce repeat(ilabels,n)
to repeat(olabels,m)
with weight w^n
for any integer n
.
If star
is true
, cfst
will transduce the empty string to itself with weight one.
FiniteStateTransducers.closure!
— Functionclosure!(fst; star=true)
Same as closure
but modifies fst
in-place.
Composition
Base.:∘
— Function∘(A::WFST,B::WFST; filter=EpsSeq, connect=true)
Perform composition of the transucers A
and B
.
The keyword filter
can specify the composition filter to be used.
If connect
is set to true
after completing the composition the connect
algorithm is applied.
See Allauzen et al. "Filters for Efficient Composition of Weighted Finite-State Transducers"
FiniteStateTransducers.Trivial
— TypeTrivial
Simplest composition filter, can be used for epsilon-free WFSTs.
FiniteStateTransducers.EpsMatch
— TypeEpsMatch
Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths and giving priority to those that match epsilon labels.
FiniteStateTransducers.EpsSeq
— TypeEpsSeq
Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths. Gives priority to epsilon paths consisting of epsilon-arcs in A
followed by epsilon-arcs in B
.
Concatenation
Base.:*
— Function*(A::WFST,B::WFST)
Returns C
, the concatenation/product of A
and B
. If A
converts the sequence a_i
to a_o
with weight wa
and B
converts the sequence b_i
to b_o
with weight wb
then Cconverts the sequence
[ai,bi]to
[ao,bo]with weight
wa*wb`.
julia> sym = ["a","b","c","d","α","β","γ","δ"];
julia> A = WFST(sym);
julia> add_arc!(A,1=>2,"a"=>"α");
julia> add_arc!(A,2=>3,"b"=>"β");
julia> initial!(A,1); final!(A,3,5)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
((3/5.0f0))
julia> B = WFST(sym);
julia> add_arc!(B,1=>2,"c"=>"γ");
julia> add_arc!(B,2=>3,"d"=>"δ");
julia> initial!(B,1); final!(B,3,2)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
c:γ/0.0f0 → (2)
(2)
d:δ/0.0f0 → (3)
((3/2.0f0))
julia> C = A*B
WFST #states: 6, #arcs: 5, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
(3)
ϵ:ϵ/5.0f0 → (4)
(4)
c:γ/0.0f0 → (5)
(5)
d:δ/0.0f0 → (6)
((6/2.0f0))
julia> A(["a","b"])
(["α", "β"], 5.0f0)
julia> B(["c","d"])
(["γ", "δ"], 2.0f0)
julia> C(["a","b","c","d"])
(["α", "β", "γ", "δ"], 7.0f0)
Connect
FiniteStateTransducers.connect
— Functionconnect(fst::WFST, [i = get_initial(fst,single=true)])
Remove states that do not belong to paths that end in final states when starting from i
.
FiniteStateTransducers.connect!
— Functionconnect!(fst::WFST, [i = get_initial(fst,single=true)])
Same as connect
but modifies fst
in-place.
Determinize
FiniteStateTransducers.determinize_fsa
— Functiondeterminize_fsa(fsa)
Returns a deterministic finite state acceptor. fsa
must be an acceptor (is_acceptor
).
Requires the WFST to be defined in a left distributive and weakly divisible semiring.
See M. Mohri, F. Pereira, Fernando, M. Riley, Michael "Speech Recognition with Weighted Finite-State Transducers" in Springer Handb. Speech Process. 2008 for details.
Epsilon Removal
FiniteStateTransducers.rm_eps
— Functionrm_eps(fst::WFST)
Returns an equivalent WFST where arcs with input and output labels are removed.
FiniteStateTransducers.rm_eps!
— Functionrm_eps!(fst::WFST)
Same as rm_eps
but operates in place.
Iterators
FiniteStateTransducers.BFS
— TypeBFS(fst,initial)
Returns an iterator that performs a breadth-first search (BFS) of fst
starting from the state initial
.
At each iteartion the tuple (i,p)
is returned, where i
is the current state and p
a Path
.
FiniteStateTransducers.get_paths
— Functionget_paths(fst::FST, [i = get_initial(fst;single=true)])
Returns an iterator that generates all of the possible paths of fst
starting from i
.
Notice that the fst
must be acyclic for the algorithm to stop (see is_acyclic
).
FiniteStateTransducers.collectpaths
— Functioncollectpaths(fst::FST, [i = get_initial(fst;first=true)])
Returns an array containing all of the possible paths of fst
starting from i
.
Notice that the fst
must be acyclic for the algorithm to stop (see is_acyclic
).
FiniteStateTransducers.DFS
— TypeDFS(fst::WFST, initial; filter=arc->true)
Creates an iterator that performs a depth-first search (DFS) of the fst
starting from the state initial
. The function filter can be specify in order to perform the search ignoring arcs with user-defined properties.
For each iterator the tuple (p,s,n,d,e)
is returned, where:
p
is the parent parent state (p==0 if there is no parent)s
is the current staten
next state:n==0
ifs
has no arcs or if completely explored. These will give you standard DFS iterations.d
is aBool
: iftrue
means a new state is discovered, can be used to pick up DFS its (see example below)e
is aBool
, iftrue
means all arcs have been inspecteda
inspected arc (empty ifd == false
)
The following cases/colors are possible:
d == false && n ==0
-> black state, state has been checked completelyd == false && n !=0
->n
is already grayd == true -> n
state is colored gray, i.e. is visited for the first time
Example
julia> fst = WFST(Dict(1=>1));
julia> add_arc!(fst, 1, 2, 1, 1, 1.0);
julia> add_arc!(fst, 1, 5, 1, 1, 1.0);
julia> add_arc!(fst, 1, 3, 1, 1, 1.0);
julia> add_arc!(fst, 2, 6, 1, 1, 1.0);
julia> add_arc!(fst, 2, 4, 1, 1, 1.0);
julia> add_arc!(fst, 5, 4, 1, 1, 1.0);
julia> add_arc!(fst, 3, 4, 1, 1, 1.0);
julia> add_arc!(fst, 3, 7, 1, 1, 1.0);
julia> initial!(fst,1)
WFST #states: 7, #arcs: 8, #isym: 1, #osym: 1
|1/0.0f0|
1:1/1.0f0 → (2)
1:1/1.0f0 → (5)
1:1/1.0f0 → (3)
(2)
1:1/1.0f0 → (6)
1:1/1.0f0 → (4)
(3)
1:1/1.0f0 → (4)
1:1/1.0f0 → (7)
(4)
(5)
1:1/1.0f0 → (4)
(6)
(7)
julia> dfs = FiniteStateTransducers.DFS(fst,1); # DFS starting from state 1
julia> for (p,s,n,d,e,a) in dfs
println("$p,$s,$n")
if d == true
println("visiting first time $n (Gray)")
else
if e
println("completed state $s (Black)")
else
println("node $n already visited")
end
end
end
0,1,2
visiting first time 2 (Gray)
1,2,6
visiting first time 6 (Gray)
2,6,0
completed state 6 (Black)
1,2,4
visiting first time 4 (Gray)
2,4,0
completed state 4 (Black)
1,2,0
completed state 2 (Black)
0,1,5
visiting first time 5 (Gray)
1,5,4
node 4 already visited
1,5,0
completed state 5 (Black)
0,1,3
visiting first time 3 (Gray)
1,3,4
node 4 already visited
1,3,7
visiting first time 7 (Gray)
3,7,0
completed state 7 (Black)
1,3,0
completed state 3 (Black)
0,1,0
completed state 1 (Black)
FiniteStateTransducers.is_visited
— Functionis_visited(fst::WFST, i = get_initial(fst;single=true))
Return an array of booleans indicating if the i-th state of the fst is visted starting form i
.
Reverse
Base.reverse
— Functionreverse(fst::WFST)
Returns rfst
, the reversed version of fst
. If fst
transuces the sequence x
to y
with weight w
, rfst
transuces the reverse of x
to the reverse of y
with weight reverse(w)
.
Shortest Distance
FiniteStateTransducers.shortest_distance
— Functionshortest_distance(fst[, s=get_initial(fst); filter=arc->true, reverse=false)
Computes the shortest distance between the state s
to every other state. The shortest distance between s
and i
is defined as the sum of the weights of all paths between these states.
Weights are required to be rigth distributive and k-closed.
If fst
has N
states a N
-long vector is returned where the i
-the element contains the shortest distance between the states s
and i
.
If reversed==true
the shortest distance from the final states is computed. In this case s
has no effect since a new initial superstate is added to the reversed WFST. Here weights are required to be left distributive and k-closed.
See Mohri "Semiring framework and algorithms for shortest-distance problems", Journal of Automata, Languages and Combinatorics 7(3): 321-350, 2002.
Topological sort
FiniteStateTransducers.topsort
— Functiontopsort(fst::WFST, i = get_initial(fst); filter=arc->true)
Requires fst
to be acyclic.
Returns an equivalent WFST to fst
which is topologically sorted starting from the i
-th state.
Modify the filter
function to perform the operation considering only specific arcs.
FiniteStateTransducers.topsortperm
— Functiontopsortperm(fst::WFST, i = get_initial(fst); filter=arc->true)
Requires fst
to be acyclic.
Returns the topological permutation of states of fst
starting from the i
-th state.
Modify the filter
function to perform the operation considering only specific arcs.
FiniteStateTransducers.get_scc
— Functionget_scc(fst::WFST, i = get_initial(fst;single=true); filter=arc->true)
Calculates the strongly connected components of 'fst' using Tarjan's algorithm. Returns a tuple (scc,c,v)
:
scc
are the strongly connected components offst
c
is a boolean array containing the accessibility form thei
v
are the visited states of thefst
The function filter
can be used to consider only arcs with specific properties.
WFST label transition extraction
FiniteStateTransducers.wfst2tr
— Functionwfst2tr(fst,Nt; get_label=get_ilabel)
Returns an array time2tr
of length Nt-1
. The t
-th element of time2tr
is a dictionary mapping the transition index between input label at time t
to the input label at time t+1
.
julia> using FiniteStateTransducers
julia> H = WFST(["a1","a2","a3"],["a"]);
julia> add_arc!(H,1,2,"a1","a");
julia> add_arc!(H,2,3,"a2","<eps>");
julia> add_arc!(H,2,2,"a2","<eps>");
julia> add_arc!(H,3,4,"a3","<eps>");
julia> add_arc!(H,3,2,"a3","a");
julia> initial!(H,1);
julia> final!(H,4)
WFST #states: 4, #arcs: 5, #isym: 3, #osym: 1
|1/0.0f0|
a1:a/0.0f0 → (2)
(2)
a2:ϵ/0.0f0 → (3)
a2:ϵ/0.0f0 → (2)
(3)
a3:ϵ/0.0f0 → (4)
a3:a/0.0f0 → (2)
((4/0.0f0))
julia> Nt = 10;
julia> time2tr = wfst2tr(H,Nt)
9-element Array{Dict{Int64,Array{Int64,1}},1}:
Dict(1 => [2])
Dict(2 => [2, 3])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2, 3],3 => [2])
Dict(2 => [2],3 => [2])
Dict(2 => [3])