Arcmap

FiniteStateTransducers.arcmapFunction

arcmap(f::Function, fst::WFST, args...; modify_initials=identity, modify_finals=identity, isym=get_isym(fst), osym=get_osym(fst))

Creates a new WFST from fst by applying the function f to all its arcs, see Arc. The arguments of f can be specified in args.

The functions modify_initials and modify_finals operate in the initial and final dictionaries, which keys are the initial/final states and values are the corresponding state weights.

Input and output tables can also be modified using the keywords isym and osym.

The following example shows how to use arcmap to convert the type of weight of a WFST:

julia> A = WFST(["a","b","c"]); # by default weight is TropicalWeight{Float32}

julia> add_arc!(A,1=>2,"a"=>"a",1);

julia> add_arc!(A,1=>3,"b"=>"c",3);

julia> initial!(A,1); final!(A,2,4); final!(A,3,2)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/0.0f0|
a:a/3.0f0 → (2)
b:c/3.0f0 → (3)
((2/4.0f0))
((3/2.0f0))

julia> trop2prob(x) = ProbabilityWeight{Float64}(exp(-get(x)))
trop2prob (generic function with 1 method)

julia> function trop2prob(arc::Arc)
           ilab = get_ilabel(arc)
           olab = get_olabel(arc)
           w = trop2prob(get_weight(arc))
           n = get_nextstate(arc)
           return Arc(ilab,olab,w,n)
       end
trop2prob (generic function with 2 methods)

julia> trop2prob(initials::Dict) = Dict(i => trop2prob(w) for (i,w) in initials)
trop2prob (generic function with 3 methods)

julia> arcmap(trop2prob,A; modify_initials=trop2prob, modify_finals=trop2prob)
WFST #states: 3, #arcs: 2, #isym: 3, #osym: 3
|1/1.0|
a:a/0.3678794503211975 → (2)
b:c/0.049787066876888275 → (3)
((2/0.018315639346837997))
((3/0.1353352814912796))
source
Base.invFunction

inv(fst::WFST)

Inverts fst such that input labels are swaped with output labels.

julia> A = linearfst(["a","b","c"],[1,2,3],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:1/1.0 → (2)
(2)
b:2/1.0 → (3)
(3)
c:3/1.0 → (4)
((4/1.0))


julia> inv(A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
1:a/1.0 → (2)
(2)
2:b/1.0 → (3)
(3)
3:c/1.0 → (4)
((4/1.0))
source
FiniteStateTransducers.projFunction

proj(get_iolabel::Function, fst::WFST)

Projects input labels to output labels (or viceversa). The function get_iolabel should either be the function get_ilabel or get_olabel.

julia> A = linearfst(["a","b","c"],[1,2,3],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:1/1.0 → (2)
(2)
b:2/1.0 → (3)
(3)
c:3/1.0 → (4)
((4/1.0))

julia> proj(get_ilabel, A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
a:a/1.0 → (2)
(2)
b:b/1.0 → (3)
(3)
c:c/1.0 → (4)
((4/1.0))

julia> proj(get_olabel, A)
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/1.0|
1:1/1.0 → (2)
(2)
2:2/1.0 → (3)
(3)
3:3/1.0 → (4)
((4/1.0))
source

Closure

FiniteStateTransducers.closureFunction

closure(fst; star=true)

Returns a new WFST cfst that is the closure of fst. If fst transuces labels to olabel with weight w, cfst will be able to transduce repeat(ilabels,n) to repeat(olabels,m) with weight w^n for any integer n.

If star is true, cfst will transduce the empty string to itself with weight one.

source

Composition

Base.:∘Function

∘(A::WFST,B::WFST; filter=EpsSeq, connect=true)

Perform composition of the transucers A and B.

julia> A = linearfst(["a","b","c"],["α","β","γ"],ones(3))
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/0.0f0|
a:α/1.0f0 → (2)
(2)
b:β/1.0f0 → (3)
(3)
c:γ/1.0f0 → (4)
((4/0.0f0))

julia> B = matrix2wfst(["α","β","γ"],[:w,:x,:z],ones(3,3))
WFST #states: 4, #arcs: 9, #isym: 3, #osym: 3
|1/0.0f0|
α:w/1.0f0 → (2)
β:x/1.0f0 → (2)
γ:z/1.0f0 → (2)
(2)
α:w/1.0f0 → (3)
β:x/1.0f0 → (3)
γ:z/1.0f0 → (3)
(3)
α:w/1.0f0 → (4)
β:x/1.0f0 → (4)
γ:z/1.0f0 → (4)
((4/0.0f0))

julia> A∘B
WFST #states: 4, #arcs: 3, #isym: 3, #osym: 3
|1/0.0f0|
a:w/2.0f0 → (2)
(2)
b:x/2.0f0 → (3)
(3)
c:z/2.0f0 → (4)
((4/0.0f0))

The keyword filter can specify the composition filter to be used, which makes it possible to handle epsilon-transitions. See Allauzen et al. "Filters for Efficient Composition of Weighted Finite-State Transducers".

If connect is set to true after completing the composition the connect algorithm is applied.

source
FiniteStateTransducers.EpsMatchType

EpsMatch

Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths and giving priority to those that match epsilon labels.

julia> A = txt2fst("0 1 a a 3.0
       1 2 b <eps> 77.0
       2 3 c <eps> 33.0
       3 4 d d 44.0
       4 90.0
       ",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:a/3.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:d/44.0f0 → (5)
((5/90.0f0))

julia> B = txt2fst("0 1 a d 15.0
       1 2 <eps> e 19.0
       2 3 d a 12.0
       3 55.0
       ",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 4, #arcs: 3, #isym: 5, #osym: 5
|1/0.0f0|
a:d/15.0f0 → (2)
(2)
ϵ:e/19.0f0 → (3)
(3)
d:a/12.0f0 → (4)
((4/55.0f0))

julia> C = ∘(A,B;filter=EpsMatch)
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:d/18.0f0 → (2)
(2)
b:e/96.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:a/56.0f0 → 5)
((5/145.0f0))
source
FiniteStateTransducers.EpsSeqType

EpsSeq

Can be used for WFSTs containing epsilon labels. Avoids redundant epsilon paths. Gives priority to epsilon paths consisting of epsilon-arcs in A followed by epsilon-arcs in B.

julia> A = txt2fst("0 1 a a 3.0
       1 2 b <eps> 77.0
       2 3 c <eps> 33.0
       3 4 d d 44.0
       4 90.0
       ",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 5, #arcs: 4, #isym: 5, #osym: 5
|1/0.0f0|
a:a/3.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
d:d/44.0f0 → (5)
((5/90.0f0))

julia> B = txt2fst("0 1 a d 15.0
       1 2 <eps> e 19.0
       2 3 d a 12.0
       3 55.0
       ",Dict("a"=>1,"b"=>2,"c"=>3,"d"=>4,"e"=>5))
WFST #states: 4, #arcs: 3, #isym: 5, #osym: 5
|1/0.0f0|
a:d/15.0f0 → (2)
(2)
ϵ:e/19.0f0 → (3)
(3)
d:a/12.0f0 → (4)
((4/55.0f0))

julia> C = ∘(A,B;filter=EpsSeq)
WFST #states: 6, #arcs: 5, #isym: 5, #osym: 5
|1/0.0f0|
a:d/18.0f0 → (2)
(2)
b:ϵ/77.0f0 → (3)
(3)
c:ϵ/33.0f0 → (4)
(4)
ϵ:e/19.0f0 → (5)
(5)
d:a/56.0f0 → (6)
((6/145.0f0))
source

Concatenation

Base.:*Function

*(A::WFST,B::WFST)

Returns C, the concatenation/product of A and B. If A converts the sequence a_i to a_o with weight wa and B converts the sequence b_i to b_o with weight wb then Cconverts the sequence[ai,bi]to[ao,bo]with weightwa*wb`.

julia> sym = ["a","b","c","d","α","β","γ","δ"];

julia> A = WFST(sym);

julia> add_arc!(A,1=>2,"a"=>"α");

julia> add_arc!(A,2=>3,"b"=>"β");

julia> initial!(A,1); final!(A,3,5)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
((3/5.0f0))

julia> B = WFST(sym);

julia> add_arc!(B,1=>2,"c"=>"γ");

julia> add_arc!(B,2=>3,"d"=>"δ");

julia> initial!(B,1); final!(B,3,2)
WFST #states: 3, #arcs: 2, #isym: 8, #osym: 8
|1/0.0f0|
c:γ/0.0f0 → (2)
(2)
d:δ/0.0f0 → (3)
((3/2.0f0))

julia> C = A*B
WFST #states: 6, #arcs: 5, #isym: 8, #osym: 8
|1/0.0f0|
a:α/0.0f0 → (2)
(2)
b:β/0.0f0 → (3)
(3)
ϵ:ϵ/5.0f0 → (4)
(4)
c:γ/0.0f0 → (5)
(5)
d:δ/0.0f0 → (6)
((6/2.0f0))

julia> A(["a","b"])
(["α", "β"], 5.0f0)

julia> B(["c","d"])
(["γ", "δ"], 2.0f0)

julia> C(["a","b","c","d"])
(["α", "β", "γ", "δ"], 7.0f0)
source

Connect

FiniteStateTransducers.connectFunction

connect(fst::WFST, [i = get_initial(fst,single=true)])

Remove states that do not belong to paths that end in final states when starting from i.

source

Determinize

Epsilon Removal

Iterators

FiniteStateTransducers.BFSType

BFS(fst,initial)

Returns an iterator that performs a breadth-first search (BFS) of fst starting from the state initial.

At each iteartion the tuple (i,p) is returned, where i is the current state and p a Path.

source
FiniteStateTransducers.get_pathsFunction

get_paths(fst::FST, [i = get_initial(fst;single=true)])

Returns an iterator that generates all of the possible paths of fst starting from i.

Notice that the fst must be acyclic for the algorithm to stop (see is_acyclic).

source
FiniteStateTransducers.collectpathsFunction

collectpaths(fst::FST, [i = get_initial(fst;first=true)])

Returns an array containing all of the possible paths of fst starting from i.

Notice that the fst must be acyclic for the algorithm to stop (see is_acyclic).

source
FiniteStateTransducers.DFSType

DFS(fst::WFST, initial; filter=arc->true)

Creates an iterator that performs a depth-first search (DFS) of the fst starting from the state initial. The function filter can be specify in order to perform the search ignoring arcs with user-defined properties.

For each iterator the tuple (p,s,n,d,e) is returned, where:

  • p is the parent parent state (p==0 if there is no parent)
  • s is the current state
  • n next state: n==0 if s has no arcs or if completely explored. These will give you standard DFS iterations.
  • d is a Bool: if true means a new state is discovered, can be used to pick up DFS its (see example below)
  • e is a Bool, if true means all arcs have been inspected
  • a inspected arc (empty if d == false)

The following cases/colors are possible:

  • d == false && n ==0 -> black state, state has been checked completely
  • d == false && n !=0 -> n is already gray
  • d == true -> n state is colored gray, i.e. is visited for the first time

Example

julia> fst = WFST(Dict(1=>1));

julia> add_arc!(fst, 1, 2, 1, 1, 1.0);

julia> add_arc!(fst, 1, 5, 1, 1, 1.0);

julia> add_arc!(fst, 1, 3, 1, 1, 1.0);

julia> add_arc!(fst, 2, 6, 1, 1, 1.0);

julia> add_arc!(fst, 2, 4, 1, 1, 1.0);

julia> add_arc!(fst, 5, 4, 1, 1, 1.0);

julia> add_arc!(fst, 3, 4, 1, 1, 1.0);

julia> add_arc!(fst, 3, 7, 1, 1, 1.0);

julia> initial!(fst,1)
WFST #states: 7, #arcs: 8, #isym: 1, #osym: 1
|1/0.0f0|
1:1/1.0f0 → (2)
1:1/1.0f0 → (5)
1:1/1.0f0 → (3)
(2)
1:1/1.0f0 → (6)
1:1/1.0f0 → (4)
(3)
1:1/1.0f0 → (4)
1:1/1.0f0 → (7)
(4)
(5)
1:1/1.0f0 → (4)
(6)
(7)

julia> dfs = FiniteStateTransducers.DFS(fst,1); # DFS starting from state 1

julia> for (p,s,n,d,e,a) in dfs
         println("$p,$s,$n")
         if d == true
           println("visiting first time $n (Gray)")
         else
           if e
             println("completed state $s (Black)")
           else
             println("node $n already visited")
           end
         end
       end
0,1,2
visiting first time 2 (Gray)
1,2,6
visiting first time 6 (Gray)
2,6,0
completed state 6 (Black)
1,2,4
visiting first time 4 (Gray)
2,4,0
completed state 4 (Black)
1,2,0
completed state 2 (Black)
0,1,5
visiting first time 5 (Gray)
1,5,4
node 4 already visited
1,5,0
completed state 5 (Black)
0,1,3
visiting first time 3 (Gray)
1,3,4
node 4 already visited
1,3,7
visiting first time 7 (Gray)
3,7,0
completed state 7 (Black)
1,3,0
completed state 3 (Black)
0,1,0
completed state 1 (Black)
source
FiniteStateTransducers.is_visitedFunction

is_visited(fst::WFST, i = get_initial(fst;single=true))

Return an array of booleans indicating if the i-th state of the fst is visted starting form i.

source

Reverse

Base.reverseFunction

reverse(fst::WFST)

Returns rfst, the reversed version of fst. If fst transuces the sequence x to y with weight w, rfst transuces the reverse of x to the reverse of y with weight reverse(w).

source

Shortest Distance

FiniteStateTransducers.shortest_distanceFunction

shortest_distance(fst[, s=get_initial(fst); filter=arc->true, reverse=false)

Computes the shortest distance between the state s to every other state. The shortest distance between s and i is defined as the sum of the weights of all paths between these states.

Weights are required to be rigth distributive and k-closed.

If fst has N states a N-long vector is returned where the i-the element contains the shortest distance between the states s and i.

If reversed==true the shortest distance from the final states is computed. In this case s has no effect since a new initial superstate is added to the reversed WFST. Here weights are required to be left distributive and k-closed.

See Mohri "Semiring framework and algorithms for shortest-distance problems", Journal of Automata, Languages and Combinatorics 7(3): 321-350, 2002.

source

Topological sort

FiniteStateTransducers.topsortFunction

topsort(fst::WFST, i = get_initial(fst); filter=arc->true)

Requires fst to be acyclic.

Returns an equivalent WFST to fst which is topologically sorted starting from the i-th state.

Modify the filter function to perform the operation considering only specific arcs.

source
FiniteStateTransducers.topsortpermFunction

topsortperm(fst::WFST, i = get_initial(fst); filter=arc->true)

Requires fst to be acyclic.

Returns the topological permutation of states of fst starting from the i-th state.

Modify the filter function to perform the operation considering only specific arcs.

source
FiniteStateTransducers.get_sccFunction

get_scc(fst::WFST, i = get_initial(fst;single=true); filter=arc->true)

Calculates the strongly connected components of 'fst' using Tarjan's algorithm. Returns a tuple (scc,c,v):

  • scc are the strongly connected components of fst
  • c is a boolean array containing the accessibility form the i
  • v are the visited states of the fst

The function filter can be used to consider only arcs with specific properties.

source

WFST label transition extraction

FiniteStateTransducers.wfst2trFunction

wfst2tr(fst,Nt; get_label=get_ilabel)

Returns an array time2tr of length Nt-1. The t-th element of time2tr is a dictionary mapping the transition index between input label at time t to the input label at time t+1.

julia> using FiniteStateTransducers

julia> H = WFST(["a1","a2","a3"],["a"]);

julia> add_arc!(H,1,2,"a1","a");

julia> add_arc!(H,2,3,"a2","<eps>");

julia> add_arc!(H,2,2,"a2","<eps>");

julia> add_arc!(H,3,4,"a3","<eps>");

julia> add_arc!(H,3,2,"a3","a");

julia> initial!(H,1);

julia> final!(H,4)
WFST #states: 4, #arcs: 5, #isym: 3, #osym: 1
|1/0.0f0|
a1:a/0.0f0 → (2)
(2)
a2:ϵ/0.0f0 → (3)
a2:ϵ/0.0f0 → (2)
(3)
a3:ϵ/0.0f0 → (4)
a3:a/0.0f0 → (2)
((4/0.0f0))


julia> Nt = 10;

julia> time2tr = wfst2tr(H,Nt)
9-element Array{Dict{Int64,Array{Int64,1}},1}:
 Dict(1 => [2])
 Dict(2 => [2, 3])
 Dict(2 => [2, 3],3 => [2])
 Dict(2 => [2, 3],3 => [2])
 Dict(2 => [2, 3],3 => [2])
 Dict(2 => [2, 3],3 => [2])
 Dict(2 => [2, 3],3 => [2])
 Dict(2 => [2],3 => [2])
 Dict(2 => [3])
source

wfst2tr(fst; get_label=get_ilabel, convert_weight=identity)

Extract transition matrix and initial probabilities from a WFST representing an HMM model.

julia> add_arc!(H,1,2,"a1","a");

julia> add_arc!(H,2,2,"a1","<eps>",-log(0.3));

julia> add_arc!(H,2,3,"a2","<eps>",-log(0.7));

julia> add_arc!(H,3,3,"a2","<eps>",-log(0.3));

julia> add_arc!(H,3,4,"a3","<eps>",-log(0.7));

julia> add_arc!(H,4,4,"a3","<eps>",-log(0.3));

julia> add_arc!(H,4,2,"a1","a",-log(0.7));

julia> initial!(H,1);

julia> final!(H,4)
WFST #states: 4, #arcs: 12, #isym: 3, #osym: 1
|1/0.0f0|
a1:a/0.0f0 → (2)
a1:a/0.0f0 → (2)
(2)
a2:ϵ/0.0f0 → (3)
a2:ϵ/0.0f0 → (2)
a1:ϵ/1.2039728f0 → (2)
a2:ϵ/0.35667494f0 → (3)
(3)
a3:ϵ/0.0f0 → (4)
a3:a/0.0f0 → (2)
a2:ϵ/1.2039728f0 → (3)
a3:ϵ/0.35667494f0 → (4)
((4/0.0f0))
a3:ϵ/1.2039728f0 → (4)
a1:a/0.35667494f0 → (2)

julia> a,A = wfst2tr(H; convert_weight = w -> exp(-get(w)))
(Float32[1.0, 0.0, 0.0], Float32[0.3 0.7 0.0; 0.0 0.3 0.7; 0.7 0.0 0.3])
source