<?xml version="1.0" encoding="UTF-8"?>
<?latexml searchpaths="/home/japhy/scienceReplication.artiswrong.com/paper_files/arxiv/2506.09735/latex_extracted"?>
<?latexml class="IEEEtran" options="10pt,journal,compsoc"?>
<?latexml package="amsmath,amsfonts"?>
<?latexml package="algorithmic"?>
<?latexml package="algorithm"?>
<?latexml package="array"?>
<?latexml package="subfig" options="caption=false,font=normalsize,labelfont=sf,textfont=sf"?>
<?latexml package="textcomp"?>
<?latexml package="stfloats"?>
<?latexml package="url"?>
<?latexml package="verbatim"?>
<?latexml package="graphicx"?>
<?latexml package="multirow"?>
<?latexml package="booktabs"?>
<?latexml package="chngcntr"?>
<?latexml package="colortbl"?>
<?latexml package="ulem"?>
<?latexml package="hyperref"?>
<?latexml package="threeparttable"?>
<?latexml package="caption"?>
<?latexml package="xparse"?>
<?latexml package="etoolbox"?>
<?latexml package="xcolor"?>
<?latexml package="ragged2e"?>
<?latexml package="titlesec"?>
<?latexml package="svg"?>
<?latexml package="bm"?>
<?latexml package="tabularx"?>
<?latexml package="fontenc" options="T1"?>
<?latexml package="soul, color, xcolor"?>
<?latexml RelaxNGSchema="LaTeXML"?>
<?latexml package="cite" options="nocompress"?>
<document xmlns="http://dlmf.nist.gov/LaTeXML" class="ltx_authors_1line">
  <resource src="LaTeXML.css" type="text/css"/>
  <resource src="ltx-article.css" type="text/css"/>
  <resource src="ltx-ulem.css" type="text/css"/>
  <title>MPFNet: A Multi-Prior Fusion Network with a Progressive Training Strategy for Micro-Expression Recognition</title>
  <creator role="author">
    <personname>
Chuang Ma,
Shaokai Zhao,
Dongdong Zhou,
Yu Pei,
Zhiguo Luo,
Liang Xie,
Ye Yan<sup>#</sup>,
Erwei Yin<sup>#</sup>
</personname>
    <contact role="thanks">
E. Yin (yinerwei1985@gmail.com) and Y. Yan (yy_taiic@163.com) are the corresponding authors.
C. Ma, S. Zhao, Y. Pei, Z. Luo, L. Xie, Y. Yan, and E. Yin are with the Defense Innovation Institute, Academy of Military Sciences (AMS) and Intelligent Game and Decision Laboratory, Beijing, China.
Dongdong Zhou is with the School of Computer Science and Technology, Dalian University of Technology, Dalian, China.
</contact>
  </creator>
  <abstract name="Abstract">
<!-- Source code: https://github.com/Mac0504/MPFNet -->
    <p>Micro-expression recognition (MER), a critical subfield of affective computing, is more challenging than macro-expression recognition because micro-expressions are brief and low in intensity. While incorporating prior knowledge has been shown to enhance MER performance, existing methods predominantly rely on a single, simple source of prior knowledge and fail to fully exploit multi-source information. This paper introduces the Multi-Prior Fusion Network (MPFNet), which uses a progressive training strategy to optimize MER. We propose two complementary encoders, the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE), both based on Inflated 3D ConvNets (I3D) with Coordinate Attention (CA) mechanisms, to improve the model’s ability to capture spatiotemporal and channel-specific features. Inspired by developmental psychology, we present two variants of MPFNet, MPFNet-P and MPFNet-C, corresponding to two fundamental modes of infant cognitive development: parallel and hierarchical processing. These variants enable the evaluation of different strategies for integrating prior knowledge. Extensive experiments demonstrate that MPFNet significantly improves MER accuracy while maintaining balanced performance across categories, achieving accuracies of 0.811, 0.924, and 0.857 on the SMIC, CASME II, and SAMM datasets, respectively. To the best of our knowledge, our approach achieves state-of-the-art performance on the SMIC and SAMM datasets.</p>
  </abstract>
  <keywords>
micro-expression, prior learning, progressive training, meta-learning, attention mechanisms.
</keywords>
  <section inlist="toc" xml:id="S1">
    <tags>
      <tag>1</tag>
      <tag role="autoref">section 1</tag>
      <tag role="refnum">1</tag>
      <tag role="typerefnum">§1</tag>
    </tags>
    <title><tag close=" ">1</tag><text font="smallcaps">Introduction</text></title>
    <para xml:id="S1.p1">
      <p>Facial expressions play an essential role in conveying human emotions and reflecting psychological states during interpersonal interactions. Micro-expressions (MEs) are involuntary facial expressions that often occur when individuals attempt to suppress or conceal their true emotions, making them crucial for uncovering genuine emotional states <cite class="ltx_citemacro_cite">[<bibref bibrefs="xie2022overview" separator="," yyseparator=","/>]</cite>. Micro-expression recognition (MER) has numerous potential applications, including mental health monitoring, human-computer interaction, and security enforcement. By detecting subtle emotional cues, MER offers valuable insights that can inform decision-making, which has led to increasing interest in the field in recent years <cite class="ltx_citemacro_cite">[<bibref bibrefs="wu2010micro" separator="," yyseparator=","/>]</cite>.</p>
    </para>
    <figure inlist="lof" labels="LABEL:figs:ME_MaE" placement="t" xml:id="S1.F1">
      <tags>
        <tag>Fig. 1</tag>
        <tag role="autoref">Figure 1</tag>
        <tag role="refnum">1</tag>
        <tag role="typerefnum">Fig. 1</tag>
      </tags>
      <graphics candidates="ME_MaE.pdf" class="ltx_centering" graphic="ME_MaE.pdf" options="width=433.62pt" xml:id="S1.F1.g1"/>
      <toccaption class="ltx_centering"><tag close=" ">1</tag>Comparison of MEs and MaEs. The top and bottom rows show examples of happiness and sadness from the CASME II and CK+ datasets, respectively. White arrows indicate the muscle movement direction of the activated facial action units.</toccaption>
      <caption class="ltx_centering"><tag close=": ">Fig. 1</tag>Comparison of MEs and MaEs. The top and bottom rows show examples of happiness and sadness from the CASME II and CK+ datasets, respectively. White arrows indicate the muscle movement direction of the activated facial action units.</caption>
    </figure>
    <para xml:id="S1.p2">
      <p>Compared to the more apparent macro-expressions (MaEs) observed in daily life, MEs, although also grounded in Ekman’s basic emotion model <cite class="ltx_citemacro_cite">[<bibref bibrefs="tracy2011four" separator="," yyseparator=","/>]</cite> (e.g., happiness, anger, sadness, surprise, fear, and disgust), exhibit notable differences. First, in terms of appearance, MEs involve subtle and rapid muscle movements localized to specific facial regions, typically lasting only between 1/25 and 1/3 of a second <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2023facial" separator="," yyseparator=","/>]</cite>. These characteristics make the detection and recognition of MEs considerably more challenging than MaEs <cite class="ltx_citemacro_cite">[<bibref bibrefs="li2022deep" separator="," yyseparator=","/>]</cite> (see Fig. <ref labelref="LABEL:figs:ME_MaE"/>). Second, in facial expression coding, although MEs share similarities with MaEs, their transient and subtle nature necessitates specialized expertise and detailed manual annotation during analysis. Psychologists often use the Facial Action Coding System (FACS) <cite class="ltx_citemacro_cite">[<bibref bibrefs="ekman1978facial" separator="," yyseparator=","/>]</cite> to analyze MEs, but this process is both time-consuming and labor-intensive. 
Finally, in terms of feature representation learning, while many models that succeed in MaE recognition, such as Convolutional Neural Networks (CNNs) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two,li2019micro,wang2024htnet" separator="," yyseparator=","/>]</cite>, Recurrent Neural Networks (RNNs) <cite class="ltx_citemacro_cite">[<bibref bibrefs="xia2019spatiotemporal,zhang2025towards" separator="," yyseparator=","/>]</cite>, and Transformers <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhang2022short,li2024micro" separator="," yyseparator=","/>]</cite>, are beginning to be applied to the ME domain, the fleeting, localized, and subtle nature of MEs presents additional challenges for these models. Addressing these challenges requires particular focus on three key areas: optimizing the model’s learning process, tackling issues of data sparsity and imbalance, and efficiently extracting fine-grained local features.</p>
    </para>
    <para xml:id="S1.p3">
      <p>Regarding model optimization, the complexity and subtlety of MEs make efficient recognition challenging when relying solely on data-driven feature learning. The integration of prior knowledge has been identified as an effective strategy to optimize the learning process and improve model performance. For instance, prior studies have shown that exploiting facial micro-movement patterns <cite class="ltx_citemacro_cite">[<bibref bibrefs="allaert2019micro" separator="," yyseparator=","/>]</cite>, modeling the relationships between different Action Units (AUs) <cite class="ltx_citemacro_cite">[<bibref bibrefs="lei2021micro" separator="," yyseparator=","/>]</cite>, and transferring knowledge from MaEs <cite class="ltx_citemacro_cite">[<bibref bibrefs="sun2020dynamic" separator="," yyseparator=","/>]</cite> can enhance the model’s ability to identify and learn ME features. To address sample sparsity and imbalance, researchers have employed various data augmentation strategies <cite class="ltx_citemacro_cite">[<bibref bibrefs="li2020local" separator="," yyseparator=","/>]</cite>, such as rotating, scaling, flipping, or generating new samples, to expand the dataset and alleviate class imbalance. Transfer learning has also been applied to carry over knowledge from larger datasets built for related tasks, which improves model adaptability in the target domain <cite class="ltx_citemacro_cite">[<bibref bibrefs="tang2024facial,gan2024transfer" separator="," yyseparator=","/>]</cite>. Furthermore, meta-learning approaches provide innovative solutions for overcoming data limitations <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2025cross,gong2023meta" separator="," yyseparator=","/>]</cite>. In terms of feature extraction, CNNs have been widely used in MER tasks <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2023coutfitgan,zhou2024learning" separator="," yyseparator=","/>]</cite>. 
However, due to the subtle muscle movements involved in MEs, CNNs still face limitations in capturing such fine-grained details. To address this issue, recent studies have introduced various attention mechanisms <cite class="ltx_citemacro_cite">[<bibref bibrefs="guo2022attention,cai2024mfdan,wu2024micro" separator="," yyseparator=","/>]</cite>, which dynamically adjust weights and integrate global information to improve model adaptability and accuracy.
</p>
    </para>
    <para xml:id="S1.p4">
      <p>Despite these advancements, MER still faces numerous challenges. For example, the prior knowledge used in model optimization is often overly simplistic or incomplete, preventing the model from fully realizing its potential <cite class="ltx_citemacro_cite">[<bibref bibrefs="arpit2017closer,zhang2023facial" separator="," yyseparator=","/>]</cite>. Data augmentation methods, such as sample synthesis, may introduce misleading artifacts into the training data <cite class="ltx_citemacro_cite">[<bibref bibrefs="uchinoura2024improved" separator="," yyseparator=","/>]</cite>, negatively impacting the model’s generalization ability. Furthermore, due to significant domain differences between MEs and MaEs, the effectiveness of transfer learning may be limited <cite class="ltx_citemacro_cite">[<bibref bibrefs="weiss2017comparing,zong2018domain" separator="," yyseparator=","/>]</cite>. In feature extraction, although attention mechanisms have been applied, further improvements are needed to efficiently capture the subtle characteristics of MEs.</p>
    </para>
    <para xml:id="S1.p5">
      <p>To address these challenges, we propose a Multi-Prior Fusion Network (MPFNet) based on a progressive training strategy, aiming to improve MER performance from three perspectives: model optimization, data processing, and feature extraction. Our approach is inspired by the multi-stage cognitive development process in human infants <cite class="ltx_citemacro_cite">[<bibref bibrefs="he2020multi" separator="," yyseparator=","/>]</cite>, where infants progressively deepen their understanding of the world through continuous interactions between domain-general learning mechanisms and evolving environmental experiences. In the early stages, infants learn basic object features by recognizing similarities and differences. As cognitive abilities mature, they can distinguish more complex features and perform more advanced classification tasks. This simple-to-complex cognitive progression provides inspiration for our model design. Specifically, we first develop a triplet network designed to minimize the feature distance within the same category while maximizing the distance between features from different categories in the embedding space during training. This contrastive learning strategy generates an encoder capable of effectively extracting general ME features. Subsequently, we utilize a self-constructed motion-enhanced, sample-balanced ME dataset to train an advanced feature encoder, enabling the capture of more complex ME features.
</p>
    </para>
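    <para>
      <p>The triplet objective described above can be illustrated with a minimal sketch. It assumes a Euclidean distance and a hinge form with a margin of 1.0; neither choice is stated in this section, and the function names and toy embeddings are hypothetical rather than taken from the MPFNet implementation.</p>
    </para>
    <para>
      <verbatim>
```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value and the Euclidean metric are illustrative assumptions.
    """
    d_ap = euclidean(anchor, positive)  # distance to a same-category sample
    d_an = euclidean(anchor, negative)  # distance to a different-category sample
    return max(0.0, d_ap - d_an + margin)


# Toy 2-D embeddings (hypothetical values, not taken from any dataset).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # same category as the anchor, distance 1
negative = [3.0, 4.0]  # different category, distance 5

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss is clipped to 0.
loss_separated = triplet_loss(anchor, positive, negative)

# Swapping the roles violates the margin: 5 - 1 + 1 = 5.
loss_violated = triplet_loss(anchor, negative, positive)
```
      </verbatim>
    </para>
    <para>
      <p>A zero loss means the triplet already satisfies the margin and contributes no gradient, so training of this kind is driven by triplets in which the different-category sample is still too close to the anchor.</p>
    </para>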
    <para xml:id="S1.p6">
      <p>To optimize the synergy between these two encoders, we design two architectures: MPFNet-P with parallel feature encoders and MPFNet-C with cascaded feature encoders. These architectures implement distinct mechanisms for integrating prior knowledge, corresponding to two fundamental modes of infant cognitive development: parallel processing and hierarchical processing. This design is grounded in established theories from developmental psychology. Specifically, Lewkowicz <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="lewkowicz2009emergence" separator="," yyseparator=","/>]</cite> argued that infant cognition does not follow a linear processing pattern but instead involves simultaneous engagement with multiple dimensions of information without strict hierarchical prioritization. In contrast, Cohen <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="cohen2002constructivist" separator="," yyseparator=","/>]</cite> proposed a hierarchical cognitive development theory, suggesting that infants’ learning systems exhibit a structured hierarchy in which the ability to process complex information is progressively built upon lower-level processing capabilities. Building on these theoretical foundations, the parallel architecture of MPFNet-P simulates the infant cognitive mode of synchronous multisensory integration, whereas the cascaded architecture of MPFNet-C reflects the hierarchical information processing mode, in which higher-level features are progressively constructed upon lower-level representations. Both the MPFNet-P and MPFNet-C architectures employ the Inflated 3D Convolutional Network (I3D) <cite class="ltx_citemacro_cite">[<bibref bibrefs="carreira2017quo" separator="," yyseparator=","/>]</cite> as their encoder backbone, augmented with Coordinate Attention (CA) blocks <cite class="ltx_citemacro_cite">[<bibref bibrefs="hou2021coordinate" separator="," yyseparator=","/>]</cite>. 
This integration, termed the CA-I3D model, enhances the network’s ability to extract meaningful spatiotemporal features by performing 3D convolutions across consecutive frames.
</p>
    </para>
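    <para>
      <p>The difference between the parallel and cascaded arrangements can be shown with a toy sketch. The two encoder functions below are hypothetical stand-ins (simple arithmetic maps, not CA-I3D networks). Concatenation as the parallel fusion operator and the first-then-second ordering of the cascade are assumptions made purely for illustration.</p>
    </para>
    <para>
      <verbatim>
```python
def first_encoder(x):
    """Stand-in for the first pretrained encoder (hypothetical toy mapping)."""
    return [2.0 * v for v in x]


def second_encoder(x):
    """Stand-in for the second pretrained encoder (hypothetical toy mapping)."""
    return [v + 1.0 for v in x]


def parallel_fusion(x):
    """Parallel arrangement: both encoders see the same input.

    Concatenating the two outputs is an assumed fusion operator.
    """
    return first_encoder(x) + second_encoder(x)


def cascaded_fusion(x):
    """Cascaded arrangement: the second encoder refines the first encoder's output.

    The ordering used here is an assumption for illustration.
    """
    return second_encoder(first_encoder(x))


clip_features = [1.0, 2.0]  # toy input, standing in for a video clip
parallel_out = parallel_fusion(clip_features)  # [2.0, 4.0, 2.0, 3.0]
cascaded_out = cascaded_fusion(clip_features)  # [3.0, 5.0]
```
      </verbatim>
    </para>
    <para>
      <p>In this sketch the parallel output keeps both views side by side and therefore doubles the feature dimension, while the cascaded output keeps the dimension fixed and exposes only the refined representation.</p>
    </para>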
    <para xml:id="S1.p7">
      <p>Furthermore, we introduce meta-learning to simulate the rapid adaptability observed in infants as they learn different tasks. Meta-learning trains the model across multiple tasks, enabling it to efficiently extract prior knowledge for adapting to new ones. Finally, to evaluate MPFNet’s performance, we conduct experiments on several publicly available ME datasets. The results show that both MPFNet-P and MPFNet-C outperform the baseline model in MER tasks, with MPFNet-C achieving particularly strong results. This demonstrates that our progressive training strategy effectively integrates multi-level prior knowledge, enhancing the model’s ability to classify MEs.</p>
    </para>
    <para xml:id="S1.p8">
      <p>To sum up, the main contributions of this research are:</p>
    </para>
    <para xml:id="S1.p9">
      <enumerate xml:id="S1.I1">
        <item xml:id="S1.I1.i1">
          <tags>
            <tag>1.</tag>
            <tag role="autoref">item 1</tag>
            <tag role="refnum">1</tag>
            <tag role="typerefnum">item 1</tag>
          </tags>
          <para xml:id="S1.I1.i1.p1">
            <p>We propose a novel MPFNet to extract both generic and advanced features of MEs, leveraging complementary prior knowledge to enhance MER. MPFNet comprises two variants: MPFNet-P and MPFNet-C, which explore prior fusion strategies from feature diversity and hierarchy perspectives.
</p>
          </para>
        </item>
        <item xml:id="S1.I1.i2">
          <tags>
            <tag>2.</tag>
            <tag role="autoref">item 2</tag>
            <tag role="refnum">2</tag>
            <tag role="typerefnum">item 2</tag>
          </tags>
          <para xml:id="S1.I1.i2.p1">
            <p>To address the challenges of limited samples and class imbalance in ME datasets, we propose a data augmentation method based on dynamic motion magnification. To capture critical spatiotemporal information of MEs, we use the CA-I3D model as the backbone of the feature encoders, integrating an attention mechanism into the I3D framework to effectively model channel relationships and long-term dependencies.</p>
          </para>
        </item>
        <item xml:id="S1.I1.i3">
          <tags>
            <tag>3.</tag>
            <tag role="autoref">item 3</tag>
            <tag role="refnum">3</tag>
            <tag role="typerefnum">item 3</tag>
          </tags>
          <para xml:id="S1.I1.i3.p1">
            <p>Extensive experiments and visual analyses demonstrate that our approach overcomes the limitations of methods that rely on a single source of prior knowledge. By integrating multiple complementary priors, we significantly improve overall classification accuracy while ensuring balanced results across categories, achieving competitive performance.</p>
          </para>
        </item>
      </enumerate>
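<!--  %We also compared the impact of different fusion strategies on model complexity and runtime efficiency, and found that MPFNet maintains efficiency while effectively balancing the depth and breadth of feature learning. -->    </para>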
    <para xml:id="S1.p10">
      <p>The remainder of this paper is structured as follows. Section <ref labelref="LABEL:related_work"/> reviews related work, followed by the details of MPFNet in Section <ref labelref="LABEL:method"/>. Section <ref labelref="LABEL:experiments"/> introduces experimental data, evaluation metrics and implementation details. Section <ref labelref="LABEL:Results_and_analysis"/> presents the experimental results, ablation study and visualization analysis. Section <ref labelref="LABEL:conclusion"/> concludes the paper. Finally, Section <ref labelref="LABEL:Ethical"/> discusses the ethical issues related to this research.
</p>
    </para>
  </section>
  <section inlist="toc" labels="LABEL:related_work" xml:id="S2">
    <tags>
      <tag>2</tag>
      <tag role="autoref">section 2</tag>
      <tag role="refnum">2</tag>
      <tag role="typerefnum">§2</tag>
    </tags>
    <title><tag close=" ">2</tag><text font="smallcaps">Related work</text></title>
    <para xml:id="S2.p1">
      <p>This study focuses on three key areas of MER: model learning, data augmentation, and feature extraction. In this section, we first provide a brief review of related studies in these areas. Building on this foundation, we then analyze the unique characteristics and innovations of our work, highlighting its advantages over existing approaches.
</p>
    </para>
    <subsection inlist="toc" xml:id="S2.SS1">
      <tags>
        <tag>2.1</tag>
        <tag role="autoref">subsection 2.1</tag>
        <tag role="refnum">2.1</tag>
        <tag role="typerefnum">§2.1</tag>
      </tags>
      <title><tag close=" ">2.1</tag><text font="italic">Prior learning strategies in ME analysis models</text></title>
      <para xml:id="S2.SS1.p1">
        <p>Studies have shown that incorporating prior knowledge into deep learning models can effectively guide them to focus on critical features of MEs, achieving superior performance in tasks such as ME recognition, spotting, and generation. For MER, Sun <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="sun2020dynamic" separator="," yyseparator=","/>]</cite> distilled and transferred knowledge from facial action units (AUs), using features from the teacher network as prior knowledge to guide the student network to learn effectively from the target ME dataset. Additionally, Wei <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="wei2023prior" separator="," yyseparator=","/>]</cite> proposed a decomposition and reconstruction graph representation learning model that integrates prior knowledge of the relationships between different AUs, improving the model’s interpretability and feature learning capabilities. For ME spotting, Yin <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="yin2023aware" separator="," yyseparator=","/>]</cite> encoded prior knowledge about the motion patterns of MEs into the network, improving spatial feature embedding and alleviating overfitting. Similarly, Wang <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2021mesnet" separator="," yyseparator=","/>]</cite> addressed ME spotting with convolutional neural networks and constrained the network’s complexity by introducing additional distribution prior knowledge, which also helps alleviate overfitting. In the field of ME generation, several studies have demonstrated that utilizing prior knowledge of MEs can significantly improve the quality of reconstructed videos. 
For instance, Zhang <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhang2023facial" separator="," yyseparator=","/>]</cite> proposed a facial prior-guided ME generation framework, which uses a facial prior module to guide the motion representation and generation of MEs, significantly improving the performance of the ME generation model.
Although prior learning strategies have been applied in ME analysis, the prior knowledge utilized in existing studies remains relatively simple and lacks the integration of multi-source information. In this study, we use the abilities to learn generic features and advanced features as complementary prior knowledge to assist MER. To the best of our knowledge, no existing study has adopted the concept of multi-prior fusion for MER.</p>
      </para>
      <figure inlist="lof" labels="LABEL:figs:MPFNet" placement="t" xml:id="S2.F2">
        <tags>
          <tag>Fig. 2</tag>
          <tag role="autoref">Figure 2</tag>
          <tag role="refnum">2</tag>
          <tag role="typerefnum">Fig. 2</tag>
        </tags>
        <graphics candidates="MPFNet.pdf" class="ltx_centering" graphic="MPFNet.pdf" options="width=390.258pt" xml:id="S2.F2.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">2</tag>The overall architecture of the proposed MPFNet, which comprises two distinct variants: MPFNet-P and MPFNet-C. Both variants incorporate two feature encoders—the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE). MPFNet-P employs a parallel encoder architecture, whereas MPFNet-C utilizes a cascaded encoder structure, each optimized for feature extraction through their respective configurations.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 2</tag>The overall architecture of the proposed MPFNet, which comprises two distinct variants: MPFNet-P and MPFNet-C. Both variants incorporate two feature encoders—the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE). MPFNet-P employs a parallel encoder architecture, whereas MPFNet-C utilizes a cascaded encoder structure, each optimized for feature extraction through their respective configurations.</caption>
      </figure>
    </subsection>
    <subsection inlist="toc" xml:id="S2.SS2">
      <tags>
        <tag>2.2</tag>
        <tag role="autoref">subsection 2.2</tag>
        <tag role="refnum">2.2</tag>
        <tag role="typerefnum">§2.2</tag>
      </tags>
      <title><tag close=" ">2.2</tag><text font="italic">Data enhancement methods to tackle few-shot and imbalanced data</text></title>
      <para xml:id="S2.SS2.p1">
        <p>Addressing the challenges of few-shot learning and class imbalance in MER has been a significant focus of recent research. Various approaches have been proposed to tackle these issues, including data augmentation, transfer learning, and meta-learning. For example, Xia <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="xia2019spatiotemporal" separator="," yyseparator=","/>]</cite> employed temporal data augmentation strategies to enlarge the limited training set and used a balanced loss function to address imbalanced training. Subsequently, Xie <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="xie2020assisted" separator="," yyseparator=","/>]</cite> proposed a data augmentation method that generates ME images conditioned on the action unit intensities extracted from MEs, alleviating the limited size and class imbalance of existing MER datasets. Transfer learning is another widely used technique, leveraging knowledge from related tasks or larger datasets to improve MER performance. Xia <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="xia2020learning" separator="," yyseparator=","/>]</cite> proposed an MER framework that leverages MaE samples for guidance and employs an adversarial learning strategy and a triplet loss to capture the shared features of ME and MaE samples. More recently, to address the issue of insufficient data for MER, Tang <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="tang2024facial" separator="," yyseparator=","/>]</cite> proposed a dual graph convolutional network architecture with transfer learning. However, augmented data poses a risk of overfitting due to its similarity to the original data. Synthetic samples can introduce artifacts that are absent from natural data, potentially misleading the model. 
Moreover, substantial domain discrepancy between MEs and MaEs may hinder the effectiveness of transfer learning. To address these challenges, this study proposes a data augmentation method based on dynamic motion magnification. This approach not only enhances the intensity of subtle movements in ME videos but also dynamically adjusts the augmentation effect based on the distribution of the original samples. As a result, it effectively mitigates issues of limited sample size and imbalanced class distribution in ME datasets.
</p>
      </para>
    </subsection>
    <subsection inlist="toc" xml:id="S2.SS3">
      <tags>
        <tag>2.3</tag>
        <tag role="autoref">subsection 2.3</tag>
        <tag role="refnum">2.3</tag>
        <tag role="typerefnum">§2.3</tag>
      </tags>
      <title><tag close=" ">2.3</tag><text font="italic">Feature extraction methods for MER</text></title>
      <para xml:id="S2.SS3.p1">
        <p>Due to the subtle and difficult-to-discern movements of facial muscles in MEs, the effectiveness of MER largely depends on how discriminative the extracted features are. Recent studies have primarily focused on leveraging high-level features derived from deep learning models. For instance, Zhao <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two" separator="," yyseparator=","/>]</cite> employed 3D convolutional neural networks (3D-CNNs) to encode both spatial and temporal information, aiming to capture comprehensive representations, including motion cues and long-sequence dependencies. Similarly, Thuseethan <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="thuseethan2023deep3dcann" separator="," yyseparator=","/>]</cite> used 3D-CNNs to learn useful spatiotemporal features from facial images and then combined the learned features with the semantic relationships between facial regions to predict MEs. However, traditional 3D-CNNs demand substantial parameters and computational resources, which constrains their ability to capture the subtle variations in MEs. Attention mechanisms enable models to concentrate on the most pertinent aspects of the input data, which has led to their extensive adoption in ME research. For example, Zhou <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2023inceptr" separator="," yyseparator=","/>]</cite> proposed a dual-branch attention network for MER, which adopts the convolutional block attention module (CBAM) to enable the model to capture the most discriminative multi-scale local and global features. Shu <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="shu2023res" separator="," yyseparator=","/>]</cite> incorporated a squeeze-excitation (SE) block into the network. This SE block highlights valuable ME features while suppressing irrelevant ones. 
More recently, Liong <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2024sfamnet" separator="," yyseparator=","/>]</cite> proposed a multi-stream MER network based on an attention mechanism, which can predict recognition confidence scores and emotion labels. The above studies demonstrate the effectiveness of attention mechanisms for MER. To capture essential temporal, spatial, and channel-specific features for MER, we integrate the CA block with the I3D model. The CA block captures cross-channel, direction-aware, and position-sensitive information, while the I3D model excels at extracting robust spatiotemporal features, facilitating comprehensive modeling of temporal dynamics in ME videos. This integration enhances the model’s ability to detect key features and significantly improves its sensitivity to subtle variations in MEs, resulting in a more accurate and robust feature representation for MER.</p>
      </para>
    </subsection>
  </section>
  <section inlist="toc" labels="LABEL:method" xml:id="S3">
    <tags>
      <tag>3</tag>
      <tag role="autoref">section 3</tag>
      <tag role="refnum">3</tag>
      <tag role="typerefnum">§3</tag>
    </tags>
    <title><tag close=" ">3</tag><text font="smallcaps">Methodology</text></title>
    <para xml:id="S3.p1">
      <p>Human beings can acquire new skills with just a few examples and learn even faster when faced with novel, related tasks. This ability is attributed to the human capacity to learn and utilize various forms of prior knowledge <cite class="ltx_citemacro_cite">[<bibref bibrefs="maguire1999functional" separator="," yyseparator=","/>]</cite>. Related studies have also leveraged prior knowledge from other domains through pre-trained deep neural networks <cite class="ltx_citemacro_cite">[<bibref bibrefs="wei2023prior,yin2023aware" separator="," yyseparator=","/>]</cite>. Inspired by this, we propose a novel multi-prior fusion network (MPFNet) for MER based on a progressive training strategy. This section begins with an overview of the proposed MPFNet architecture in Section <ref labelref="LABEL:sec:_overview"/>. Section <ref labelref="LABEL:sec:_Data_preprocessing"/> describes the preprocessing steps for the input data. In Section <ref labelref="LABEL:sec:_Pre-training"/>, we provide a detailed explanation of the pretraining process of the feature encoders. Each pretraining stage is specifically designed to correspond to the learning process of a different type of prior knowledge. To investigate effective strategies for integrating multiple types of prior knowledge, we introduce two model variants, MPFNet-P and MPFNet-C, with their operational mechanisms analyzed in Section <ref labelref="LABEL:sec:_variants"/>. Finally, Section <ref labelref="LABEL:sec:_classification"/> discusses the design and implementation of the classification module, focusing on the classifier architecture and the selection of the loss function.</p>
    </para>
    <subsection inlist="toc" labels="LABEL:sec:_overview" xml:id="S3.SS1">
      <tags>
        <tag>3.1</tag>
        <tag role="autoref">subsection 3.1</tag>
        <tag role="refnum">3.1</tag>
        <tag role="typerefnum">§3.1</tag>
      </tags>
      <title><tag close=" ">3.1</tag><text font="italic">Overview of the MPFNet architecture</text></title>
      <para xml:id="S3.SS1.p1">
        <p>The overall architecture of MPFNet is illustrated in Fig. <ref labelref="LABEL:figs:MPFNet"/>. Operating within a metric-based meta-learning framework, MPFNet takes both the support set and the query set as inputs and outputs classification results for the query set. MPFNet has two variants: MPFNet-P and MPFNet-C. Both variants comprise two feature encoders: the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE). These encoders are designed to capture prior knowledge at distinct levels of abstraction, thereby enhancing feature representation and classification performance. MPFNet-P employs a parallel encoder architecture, allowing the model to extract features from multiple perspectives and capture more diverse information. In contrast, MPFNet-C uses a cascaded encoder architecture that progressively refines feature representations, so that the model gradually focuses on the subtle yet critical features embedded in MEs. By comparing these two variants, we aim to evaluate how different strategies for integrating prior knowledge affect MER performance.
        </p>
      </para>
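      <para>
        <p>The difference between the two arrangements can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the encoders are toy linear maps with a ReLU, the feature sizes are arbitrary, and concatenation is used as a stand-in for the fusion step in the parallel variant. It is not the MPFNet implementation, only a picture of how data flows in a parallel versus a cascaded composition.</p>
      </para>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy weights; the sizes (16-d input, 8-d features) are arbitrary.
W_GFE = rng.standard_normal((8, 16))    # stand-in for the Generic Feature Encoder
W_AFE_P = rng.standard_normal((8, 16))  # stand-in AFE that reads the raw input (parallel)
W_AFE_C = rng.standard_normal((8, 8))   # stand-in AFE that reads GFE features (cascaded)


def encode(weights, x):
    """Toy encoder: linear projection followed by ReLU."""
    return np.maximum(weights @ x, 0.0)


def parallel_variant(x):
    """Parallel layout: both encoders see the same input.

    Concatenation is an assumed fusion rule, chosen only for illustration.
    """
    return np.concatenate([encode(W_GFE, x), encode(W_AFE_P, x)])


def cascaded_variant(x):
    """Cascaded layout: the second encoder refines the output of the first."""
    return encode(W_AFE_C, encode(W_GFE, x))


x = rng.standard_normal(16)
print(parallel_variant(x).shape)  # (16,)
print(cascaded_variant(x).shape)  # (8,)
```

      <para>
        <p>In the parallel sketch the two feature vectors are kept side by side, whereas in the cascaded sketch the second encoder only ever sees what the first one produced.</p>
      </para>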
<!--  %Here, the GFE focuses on capturing low-level basic patterns, while the AFE focuses on recognizing high-level abstract representations. -->    </subsection>
    <subsection inlist="toc" labels="LABEL:sec:_Data_preprocessing" xml:id="S3.SS2">
      <tags>
        <tag>3.2</tag>
        <tag role="autoref">subsection 3.2</tag>
        <tag role="refnum">3.2</tag>
        <tag role="typerefnum">§3.2</tag>
      </tags>
      <title><tag close=" ">3.2</tag><text font="italic">Data preprocessing</text></title>
      <para xml:id="S3.SS2.p1">
        <p>In this section, we implement a comprehensive preprocessing pipeline to extract robust facial features. First, we employ the face recognition algorithm provided by Alibaba Cloud<note mark="1" role="footnote" xml:id="footnote1"><tags>
              <tag>1</tag>
              <tag role="autoref">footnote 1</tag>
              <tag role="refnum">1</tag>
              <tag role="typerefnum">footnote 1</tag>
            </tags><ref class="ltx_url" font="typewriter" href="https://vision.aliyun.com/facebody">https://vision.aliyun.com/facebody</ref></note> to perform precise face detection, alignment, and cropping on ME images, minimizing the interference from non-facial areas and head pose variations. The cropped facial regions are resized to 128<Math mode="inline" tex="\times" text="*" xml:id="S3.SS2.p1.m1">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math>128 pixels. Subsequently, to address the significant variability in ME sequence lengths and the inherent noise in high-speed camera recordings, we adopt a keyframe-based frame interpolation method for sequence normalization. Following this, we extract and integrate two complementary feature modalities—inter-frame optical flow and frame difference features—to comprehensively represent the spatiotemporal characteristics of MEs. The preprocessing workflow is illustrated in Fig. <ref labelref="LABEL:figs:Feature_fuse"/>.
        </p>
      </para>
      <figure inlist="lof" labels="LABEL:figs:Feature_fuse" placement="t" xml:id="S3.F3">
        <tags>
          <tag>Fig. 3</tag>
          <tag role="autoref">Figure 3</tag>
          <tag role="refnum">3</tag>
          <tag role="typerefnum">Fig. 3</tag>
        </tags>
        <graphics candidates="Feature_fuse.pdf" class="ltx_centering" graphic="Feature_fuse.pdf" options="width=403.2666pt" xml:id="S3.F3.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">3</tag>Data preprocessing steps, including the generation of ME frames with normalized length using VFI, followed by the computation and integration of optical flow and frame difference features.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 3</tag>Data preprocessing steps, including the generation of ME frames with normalized length using VFI, followed by the computation and integration of optical flow and frame difference features.</caption>
      </figure>
      <subsubsection inlist="toc" xml:id="S3.SS2.SSS1">
        <tags>
          <tag>3.2.1</tag>
          <tag role="autoref">subsubsection 3.2.1</tag>
          <tag role="refnum">3.2.1</tag>
          <tag role="typerefnum">§3.2.1</tag>
        </tags>
        <title><tag close=" ">3.2.1</tag>Frame interpolation method</title>
        <para xml:id="S3.SS2.SSS1.p1">
          <p>In ME analysis, the frame sequence from the onset frame to the apex frame effectively reflects the dynamic characteristics of facial muscle movements and their evolving trends. However, these sequences often exhibit inconsistent frame lengths, ranging from 9 to over 100 frames. For sequences with a higher frame count, direct downsampling may result in non-smooth motion trends and potential loss of critical motion information. Conversely, sequences with insufficient frames require effective upsampling strategies to reconstruct their motion information. To address these issues, we employ a keyframe-based video frame interpolation (VFI) algorithm, designed to achieve two primary objectives: (i) generating standardized-length frame sequences, and (ii) preserving the temporal dynamics of MEs, thereby producing smoother and more continuous motion patterns that are crucial for capturing subtle ME features. Specifically, we utilize the VFI model proposed by Zhang et al. <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhang2023extracting" separator="," yyseparator=","/>]</cite>. This model restructures the information processing mechanism of inter-frame attention, enhancing appearance feature representation through attention maps and effectively capturing motion dynamics. In implementation, the onset frame <Math mode="inline" tex="I_{o}" text="I _ o" xml:id="S3.SS2.SSS1.p1.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">o</XMTok>
                </XMApp>
              </XMath>
            </Math> and apex frame <Math mode="inline" tex="I_{a}" text="I _ a" xml:id="S3.SS2.SSS1.p1.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                </XMApp>
              </XMath>
            </Math> are used as inputs to the VFI model to produce fixed-length ME sequences of 11 frames. The interpolation process can be formulated as:
          </p>
        </para>
        <para xml:id="S3.SS2.SSS1.p2">
          <equation xml:id="S3.E1">
            <tags>
              <tag>(1)</tag>
              <tag role="autoref">Equation 1</tag>
              <tag role="refnum">1</tag>
            </tags>
            <Math mode="display" tex="I_{t}=VFI(I_{o},I_{a},t),\quad t\in\{1,2,\ldots,9\}," text="formulae@(I _ t = V * F * I * vector@(I _ o, I _ a, t), t element-of set@(1, 2, ldots, 9))" xml:id="S3.E1.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E1.m1.6"/>
                  <XMWrap>
                    <XMDual xml:id="S3.E1.m1.6">
                      <XMApp>
                        <XMTok meaning="formulae"/>
                        <XMRef idref="S3.E1.m1.6.1"/>
                        <XMRef idref="S3.E1.m1.6.2"/>
                      </XMApp>
                      <XMWrap>
                        <XMApp xml:id="S3.E1.m1.6.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">I</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">V</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMTok font="italic" role="UNKNOWN">I</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="vector"/>
                                <XMRef idref="S3.E1.m1.6.1.1"/>
                                <XMRef idref="S3.E1.m1.6.1.2"/>
                                <XMRef idref="S3.E1.m1.1"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E1.m1.6.1.1">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">o</XMTok>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E1.m1.6.1.2">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMTok font="italic" role="UNKNOWN" xml:id="S3.E1.m1.1">t</XMTok>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT" rpadding="10.0pt">,</XMTok>
                        <XMApp xml:id="S3.E1.m1.6.2">
                          <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                          <XMTok font="italic" role="UNKNOWN">t</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="set"/>
                              <XMRef idref="S3.E1.m1.2"/>
                              <XMRef idref="S3.E1.m1.3"/>
                              <XMRef idref="S3.E1.m1.4"/>
                              <XMRef idref="S3.E1.m1.5"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="false">{</XMTok>
                              <XMTok meaning="1" role="NUMBER" xml:id="S3.E1.m1.2">1</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="2" role="NUMBER" xml:id="S3.E1.m1.3">2</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok name="ldots" role="ID" xml:id="S3.E1.m1.4">…</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="9" role="NUMBER" xml:id="S3.E1.m1.5">9</XMTok>
                              <XMTok role="CLOSE" stretchy="false">}</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMWrap>
                    </XMDual>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="I_{t}" text="I _ t" xml:id="S3.SS2.SSS1.p2.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                </XMApp>
              </XMath>
            </Math> represents the interpolated frame at time step <Math mode="inline" tex="t" text="t" xml:id="S3.SS2.SSS1.p2.m2">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">t</XMTok>
              </XMath>
            </Math>, and <Math mode="inline" tex="VFI(\cdot)" text="V * F * I * cdot" xml:id="S3.SS2.SSS1.p2.m3">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">V</XMTok>
                  <XMTok font="italic" role="UNKNOWN">F</XMTok>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMDual>
                    <XMRef idref="S3.SS2.SSS1.p2.m3.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMTok name="cdot" role="MULOP" xml:id="S3.SS2.SSS1.p2.m3.1">⋅</XMTok>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> denotes the frame interpolation function.</p>
        </para>
        <para xml:id="S3.SS2.SSS1.p3">
          <p>The choice of 11 frames strikes a balance between capturing fine-grained motion features and maintaining computational efficiency. This length is also informed by the successful practices of previous studies <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two,guo2019extended,khor2018enriched,li2013spontaneous" separator="," yyseparator=","/>]</cite>. This approach not only preserves the spatiotemporal information of the original ME videos and eliminates redundant frames but also accentuates the motion dynamics of the apex frame, providing robust input data for subsequent feature extraction and classification. The impact of this hyperparameter on model performance is analyzed in Section <ref labelref="LABEL:Results_and_analysis"/>.
          </p>
        </para>
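        <para>
          <p>The construction of the fixed-length sequence can be sketched as follows. The sketch assumes that the 11-frame sequence consists of the onset frame, the nine interpolated frames of Eq. (1), and the apex frame, which is our reading of the index ranges used in the text. The function linear_blend_vfi is a hypothetical placeholder that simply cross-fades the two keyframes; it is not the attention-based VFI model of Zhang et al. and only marks where that model would be called.</p>
        </para>

```python
import numpy as np


def linear_blend_vfi(onset, apex, t, num_steps=10):
    """Hypothetical stand-in for VFI(I_o, I_a, t): a linear cross-fade.

    A real VFI model would synthesize motion-aware intermediate frames here.
    """
    alpha = t / num_steps
    return (1.0 - alpha) * onset + alpha * apex


def build_sequence(onset, apex, vfi=linear_blend_vfi):
    """Return the frames I_0, ..., I_10: onset, nine interpolated frames, apex."""
    middle = [vfi(onset, apex, t) for t in range(1, 10)]  # t in {1, ..., 9}
    return np.stack([onset, *middle, apex])


# Dummy 128 x 128 RGB keyframes in place of real cropped face images.
onset = np.zeros((128, 128, 3))
apex = np.ones((128, 128, 3))
seq = build_sequence(onset, apex)
print(seq.shape)  # (11, 128, 128, 3)
```

        <para>
          <p>Because only the two keyframes are passed in, the output length is the same regardless of how many frames the original clip contained.</p>
        </para>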
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S3.SS2.SSS2">
        <tags>
          <tag>3.2.2</tag>
          <tag role="autoref">subsubsection 3.2.2</tag>
          <tag role="refnum">3.2.2</tag>
          <tag role="typerefnum">§3.2.2</tag>
        </tags>
        <title><tag close=" ">3.2.2</tag>Calculation and integration of optical flow and frame difference features</title>
        <figure inlist="lof" labels="LABEL:figs:OF_FD" placement="b" xml:id="S3.F4">
          <tags>
            <tag>Fig. 4</tag>
            <tag role="autoref">Figure 4</tag>
            <tag role="refnum">4</tag>
            <tag role="typerefnum">Fig. 4</tag>
          </tags>
          <graphics candidates="OF_FD.pdf" class="ltx_centering" graphic="OF_FD.pdf" options="width=346.896pt" xml:id="S3.F4.g1"/>
          <toccaption class="ltx_centering"><tag close=" ">4</tag>The process of obtaining optical flow and frame difference features. It can be seen that both features progressively become more prominent, effectively capturing the movement patterns of MEs.</toccaption>
          <caption class="ltx_centering"><tag close=": ">Fig. 4</tag>The process of obtaining optical flow and frame difference features. It can be seen that both features progressively become more prominent, effectively capturing the movement patterns of MEs.</caption>
        </figure>
        <para xml:id="S3.SS2.SSS2.p1">
          <p>For the fixed-length ME frame sequences, we compute inter-frame optical flow and frame difference features as handcrafted inputs for the model. Optical flow features capture pixel-level motion between frames, while frame difference features represent the intensity variations of each pixel between consecutive frames. These features are crucial for capturing the subtle muscular movements essential for MER and are inherently complementary. In this study, we employ the deep learning-based optical flow estimation method, FlowNet 2.0<note mark="2" role="footnote" xml:id="footnote2"><tags>
                <tag>2</tag>
                <tag role="autoref">footnote 2</tag>
                <tag role="refnum">2</tag>
                <tag role="typerefnum">footnote 2</tag>
              </tags><ref class="ltx_url" font="typewriter" href="https://github.com/NVIDIA/flownet2-pytorch">https://github.com/NVIDIA/flownet2-pytorch</ref></note>, which has proven effective in detecting subtle motions in videos, to extract optical flow features. The optical flow between two consecutive frames <Math mode="inline" tex="I_{t}" text="I _ t" xml:id="S3.SS2.SSS2.p1.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="I_{t+1}" text="I _ (t + 1)" xml:id="S3.SS2.SSS2.p1.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMApp>
                    <XMTok fontsize="70%" meaning="plus" role="ADDOP">+</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                    <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math> is computed as a displacement vector field <Math mode="inline" tex="(u_{t},v_{t})" text="open-interval@(u _ t, v _ t)" xml:id="S3.SS2.SSS2.p1.m3">
              <XMath>
                <XMDual>
                  <XMApp>
                    <XMTok meaning="open-interval"/>
                    <XMRef idref="S3.SS2.SSS2.p1.m3.1"/>
                    <XMRef idref="S3.SS2.SSS2.p1.m3.2"/>
                  </XMApp>
                  <XMWrap>
                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                    <XMApp xml:id="S3.SS2.SSS2.p1.m3.1">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">u</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS2.SSS2.p1.m3.2">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">v</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                    </XMApp>
                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>, where <Math mode="inline" tex="u_{t}" text="u _ t" xml:id="S3.SS2.SSS2.p1.m4">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">u</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="v_{t}" text="v _ t" xml:id="S3.SS2.SSS2.p1.m5">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">v</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                </XMApp>
              </XMath>
            </Math> represent the horizontal and vertical displacements, respectively. This can be expressed as:
          </p>
        </para>
        <para xml:id="S3.SS2.SSS2.p2">
          <equation xml:id="S3.E2">
            <tags>
              <tag>(2)</tag>
              <tag role="autoref">Equation 2</tag>
              <tag role="refnum">2</tag>
            </tags>
            <Math mode="display" tex="\left(u_{t},v_{t}\right)={FlowNet}\left(I_{t},I_{t+1}\right),\quad t\in\{0,1,%&#10;\ldots,9\}." text="formulae@(open-interval@(u _ t, v _ t) = F * l * o * w * N * e * t * open-interval@(I _ t, I _ (t + 1)), t element-of set@(0, 1, ldots, 9))" xml:id="S3.E2.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E2.m1.5"/>
                  <XMWrap>
                    <XMDual xml:id="S3.E2.m1.5">
                      <XMApp>
                        <XMTok meaning="formulae"/>
                        <XMRef idref="S3.E2.m1.5.1"/>
                        <XMRef idref="S3.E2.m1.5.2"/>
                      </XMApp>
                      <XMWrap>
                        <XMApp xml:id="S3.E2.m1.5.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="open-interval"/>
                              <XMRef idref="S3.E2.m1.5.1.1"/>
                              <XMRef idref="S3.E2.m1.5.1.2"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="true">(</XMTok>
                              <XMApp xml:id="S3.E2.m1.5.1.1">
                                <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                <XMTok font="italic" role="UNKNOWN">u</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                              </XMApp>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMApp xml:id="S3.E2.m1.5.1.2">
                                <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                <XMTok font="italic" role="UNKNOWN">v</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                              </XMApp>
                              <XMTok role="CLOSE" stretchy="true">)</XMTok>
                            </XMWrap>
                          </XMDual>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMTok font="italic" role="UNKNOWN">l</XMTok>
                            <XMTok font="italic" role="UNKNOWN">o</XMTok>
                            <XMTok font="italic" role="UNKNOWN">w</XMTok>
                            <XMTok font="italic" role="UNKNOWN">N</XMTok>
                            <XMTok font="italic" role="UNKNOWN">e</XMTok>
                            <XMTok font="italic" role="UNKNOWN">t</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="open-interval"/>
                                <XMRef idref="S3.E2.m1.5.1.3"/>
                                <XMRef idref="S3.E2.m1.5.1.4"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="true">(</XMTok>
                                <XMApp xml:id="S3.E2.m1.5.1.3">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E2.m1.5.1.4">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                  <XMApp>
                                    <XMTok fontsize="70%" meaning="plus" role="ADDOP">+</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                                    <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                                  </XMApp>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="true">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT" rpadding="10.0pt">,</XMTok>
                        <XMApp xml:id="S3.E2.m1.5.2">
                          <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                          <XMTok font="italic" role="UNKNOWN">t</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="set"/>
                              <XMRef idref="S3.E2.m1.1"/>
                              <XMRef idref="S3.E2.m1.2"/>
                              <XMRef idref="S3.E2.m1.3"/>
                              <XMRef idref="S3.E2.m1.4"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="false">{</XMTok>
                              <XMTok meaning="0" role="NUMBER" xml:id="S3.E2.m1.1">0</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="1" role="NUMBER" xml:id="S3.E2.m1.2">1</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok name="ldots" role="ID" xml:id="S3.E2.m1.3">…</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="9" role="NUMBER" xml:id="S3.E2.m1.4">9</XMTok>
                              <XMTok role="CLOSE" stretchy="false">}</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMWrap>
                    </XMDual>
                    <XMTok role="PERIOD">.</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
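        <para>
          <p>As a rough illustration of Eq. (2), the sketch below loops over the ten consecutive frame pairs of an 11-frame sequence and stacks the per-pair displacement fields into a single array of size 128 × 128 × 2 × 10. The function toy_flow is a hypothetical placeholder that returns zero displacement everywhere; it does not call FlowNet 2.0 and only marks where an optical flow estimator would be plugged in. The axis order is an assumption chosen to match the sizes stated in the text.</p>
        </para>

```python
import numpy as np


def toy_flow(frame_a, frame_b):
    """Hypothetical stand-in for an optical flow estimator.

    Returns horizontal (u) and vertical (v) displacement maps, here all zeros.
    """
    height, width = frame_a.shape[:2]
    return np.zeros((height, width)), np.zeros((height, width))


def flow_tensor(frames, estimate_flow=toy_flow):
    """Stack the flow of consecutive pairs (I_t, I_{t+1}) into H x W x 2 x (T - 1)."""
    fields = []
    for t in range(len(frames) - 1):               # t = 0, ..., 9 for 11 frames
        u, v = estimate_flow(frames[t], frames[t + 1])
        fields.append(np.stack([u, v], axis=-1))   # H x W x 2
    return np.stack(fields, axis=-1)               # H x W x 2 x (T - 1)


# Random frames in place of a real interpolated ME sequence.
frames = np.random.default_rng(0).random((11, 128, 128, 3))
flow = flow_tensor(frames)
print(flow.shape)  # (128, 128, 2, 10)
```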
        <para xml:id="S3.SS2.SSS2.p3">
          <p>The resulting optical flow features for the entire sequence are represented as a tensor <Math mode="inline" tex="\mathcal{O}\in{R}^{128\times 128\times 2\times 10}" text="O element-of R ^ (128 * 128 * 2 * 10)" xml:id="S3.SS2.SSS2.p3.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                  <XMTok font="caligraphic" role="UNKNOWN">O</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">R</XMTok>
                    <XMApp>
                      <XMTok fontsize="70%" meaning="times" role="MULOP">×</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="2" role="NUMBER">2</XMTok>
                      <XMTok fontsize="70%" meaning="10" role="NUMBER">10</XMTok>
                    </XMApp>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>, where 2 corresponds to the horizontal and vertical displacement components, and 10 denotes the number of inter-frame pairs.
The frame difference features are computed as the pixel-wise absolute intensity difference between consecutive frames for each RGB channel. For a given channel <Math mode="inline" tex="c\in\{R,G,B\}" text="c element-of set@(R, G, B)" xml:id="S3.SS2.SSS2.p3.m2">
              <XMath>
                <XMApp>
                  <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                  <XMTok font="italic" role="UNKNOWN">c</XMTok>
                  <XMDual>
                    <XMApp>
                      <XMTok meaning="set"/>
                      <XMRef idref="S3.SS2.SSS2.p3.m2.1"/>
                      <XMRef idref="S3.SS2.SSS2.p3.m2.2"/>
                      <XMRef idref="S3.SS2.SSS2.p3.m2.3"/>
                    </XMApp>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">{</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS2.SSS2.p3.m2.1">R</XMTok>
                      <XMTok role="PUNCT">,</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS2.SSS2.p3.m2.2">G</XMTok>
                      <XMTok role="PUNCT">,</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS2.SSS2.p3.m2.3">B</XMTok>
                      <XMTok role="CLOSE" stretchy="false">}</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math>, the frame difference <Math mode="inline" tex="\Delta I_{t}^{c}" text="Delta * (I _ t) ^ c" xml:id="S3.SS2.SSS2.p3.m3">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok name="Delta" role="UNKNOWN">Δ</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">I</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math> at time <Math mode="inline" tex="t" text="t" xml:id="S3.SS2.SSS2.p3.m4">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">t</XMTok>
              </XMath>
            </Math> is calculated as:
</p>
        </para>
        <para xml:id="S3.SS2.SSS2.p4">
          <equation xml:id="S3.E3">
            <tags>
              <tag>(3)</tag>
              <tag role="autoref">Equation 3</tag>
              <tag role="refnum">3</tag>
            </tags>
            <Math mode="display" tex="\Delta I_{t}^{c}=\left|I_{t+1}^{c}-I_{t}^{c}\right|,\quad t\in\{0,1,\ldots,9\}." text="formulae@(Delta * (I _ t) ^ c = absolute-value@((I _ (t + 1)) ^ c - (I _ t) ^ c), t element-of set@(0, 1, ldots, 9))" xml:id="S3.E3.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E3.m1.5"/>
                  <XMWrap>
                    <XMDual xml:id="S3.E3.m1.5">
                      <XMApp>
                        <XMTok meaning="formulae"/>
                        <XMRef idref="S3.E3.m1.5.1"/>
                        <XMRef idref="S3.E3.m1.5.2"/>
                      </XMApp>
                      <XMWrap>
                        <XMApp xml:id="S3.E3.m1.5.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok name="Delta" role="UNKNOWN">Δ</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                              </XMApp>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="absolute-value"/>
                              <XMRef idref="S3.E3.m1.5.1.1"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="VERTBAR" stretchy="true">|</XMTok>
                              <XMApp xml:id="S3.E3.m1.5.1.1">
                                <XMTok meaning="minus" role="ADDOP">-</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                    <XMApp>
                                      <XMTok fontsize="70%" meaning="plus" role="ADDOP">+</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                                      <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                                    </XMApp>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                </XMApp>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">I</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">t</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                </XMApp>
                              </XMApp>
                              <XMTok role="VERTBAR" stretchy="true">|</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                        <XMTok role="PUNCT" rpadding="10.0pt">,</XMTok>
                        <XMApp xml:id="S3.E3.m1.5.2">
                          <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                          <XMTok font="italic" role="UNKNOWN">t</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="set"/>
                              <XMRef idref="S3.E3.m1.1"/>
                              <XMRef idref="S3.E3.m1.2"/>
                              <XMRef idref="S3.E3.m1.3"/>
                              <XMRef idref="S3.E3.m1.4"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="false">{</XMTok>
                              <XMTok meaning="0" role="NUMBER" xml:id="S3.E3.m1.1">0</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="1" role="NUMBER" xml:id="S3.E3.m1.2">1</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok name="ldots" role="ID" xml:id="S3.E3.m1.3">…</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="9" role="NUMBER" xml:id="S3.E3.m1.4">9</XMTok>
                              <XMTok role="CLOSE" stretchy="false">}</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMWrap>
                    </XMDual>
                    <XMTok role="PERIOD">.</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
        <para xml:id="S3.SS2.SSS2.p5">
          <p>The resulting frame difference features for the entire sequence are represented as a tensor <Math mode="inline" tex="\mathcal{D}\in{R}^{128\times 128\times 3\times 10}" text="D element-of R ^ (128 * 128 * 3 * 10)" xml:id="S3.SS2.SSS2.p5.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                  <XMTok font="caligraphic" role="UNKNOWN">D</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">R</XMTok>
                    <XMApp>
                      <XMTok fontsize="70%" meaning="times" role="MULOP">×</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="3" role="NUMBER">3</XMTok>
                      <XMTok fontsize="70%" meaning="10" role="NUMBER">10</XMTok>
                    </XMApp>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>, where 3 corresponds to the RGB channels.
To integrate the optical flow features <Math mode="inline" tex="\mathcal{O}" text="O" xml:id="S3.SS2.SSS2.p5.m2">
              <XMath>
                <XMTok font="caligraphic" role="UNKNOWN">O</XMTok>
              </XMath>
            </Math> and frame difference features <Math mode="inline" tex="\mathcal{D}" text="D" xml:id="S3.SS2.SSS2.p5.m3">
              <XMath>
                <XMTok font="caligraphic" role="UNKNOWN">D</XMTok>
              </XMath>
            </Math>, we concatenate them along the channel dimension, resulting in a fused feature tensor <Math mode="inline" tex="\mathcal{F}\in{R}^{128\times 128\times 5\times 10}" text="F element-of R ^ (128 * 128 * 5 * 10)" xml:id="S3.SS2.SSS2.p5.m4">
              <XMath>
                <XMApp>
                  <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                  <XMTok font="caligraphic" role="UNKNOWN">F</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">R</XMTok>
                    <XMApp>
                      <XMTok fontsize="70%" meaning="times" role="MULOP">×</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="128" role="NUMBER">128</XMTok>
                      <XMTok fontsize="70%" meaning="5" role="NUMBER">5</XMTok>
                      <XMTok fontsize="70%" meaning="10" role="NUMBER">10</XMTok>
                    </XMApp>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>:
</p>
        </para>
        <para xml:id="S3.SS2.SSS2.p6">
          <equation xml:id="S3.E4">
            <tags>
              <tag>(4)</tag>
              <tag role="autoref">Equation 4</tag>
              <tag role="refnum">4</tag>
            </tags>
            <Math mode="display" tex="\mathcal{F}=\mathrm{Concat}(\mathcal{O},\mathcal{D})," text="F = Concat * open-interval@(O, D)" xml:id="S3.E4.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E4.m1.3"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E4.m1.3">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMTok font="caligraphic" role="UNKNOWN">F</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok role="UNKNOWN">Concat</XMTok>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="open-interval"/>
                            <XMRef idref="S3.E4.m1.1"/>
                            <XMRef idref="S3.E4.m1.2"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMTok font="caligraphic" role="UNKNOWN" xml:id="S3.E4.m1.1">O</XMTok>
                            <XMTok role="PUNCT">,</XMTok>
                            <XMTok font="caligraphic" role="UNKNOWN" xml:id="S3.E4.m1.2">D</XMTok>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="\mathrm{Concat}(\cdot)" text="Concat * cdot" xml:id="S3.SS2.SSS2.p6.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok role="UNKNOWN">Concat</XMTok>
                  <XMDual>
                    <XMRef idref="S3.SS2.SSS2.p6.m1.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMTok name="cdot" role="MULOP" xml:id="S3.SS2.SSS2.p6.m1.1">⋅</XMTok>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> denotes the concatenation operation along the channel dimension. This fused feature combines both motion dynamics (via optical flow) and intensity variations (via frame differences), providing a comprehensive representation of the ME sequence for subsequent analysis.
The extraction of optical flow and frame difference features is shown in Fig. <ref labelref="LABEL:figs:OF_FD"/>. Notably, as the ME progresses, both features become increasingly pronounced.
</p>
        </para>
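        <para>
          <p>As a minimal sketch of Equations 3 and 4, the NumPy snippet below computes the frame differences and concatenates them with an optical flow tensor along the channel axis. The function names, the synthetic 11-frame clip, and the all-zero flow placeholder are illustrative assumptions rather than part of the original pipeline; the optical flow estimator itself is not reproduced here.</p>
          <verbatim font="typewriter">
```python
import numpy as np


def frame_differences(frames):
    """Absolute per-channel difference between consecutive frames (Eq. 3).

    frames: array of shape (T + 1, H, W, 3); returns shape (H, W, 3, T).
    """
    # Cast first: subtracting uint8 images directly would wrap around.
    frames = np.asarray(frames, dtype=np.float32)
    diffs = np.abs(frames[1:] - frames[:-1])      # (T, H, W, 3)
    return np.transpose(diffs, (1, 2, 3, 0))      # (H, W, 3, T)


def fuse(flow, diffs):
    """Concatenate flow (H, W, 2, T) and differences (H, W, 3, T) on the channel axis (Eq. 4)."""
    return np.concatenate([flow, diffs], axis=2)  # (H, W, 5, T)


# Synthetic stand-in for 11 RGB frames (assumption: real clips are loaded elsewhere).
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(11, 128, 128, 3), dtype=np.uint8)
# Placeholder for the optical flow tensor O; a real estimator would fill this in.
flow = np.zeros((128, 128, 2, 10), dtype=np.float32)

D = frame_differences(frames)
F = fuse(flow, D)
print(D.shape, F.shape)  # (128, 128, 3, 10) (128, 128, 5, 10)
```
          </verbatim>
          <p>Because Equation 3 indexes both frame t and frame t+1 for t from 0 to 9, eleven frames are needed to obtain the ten difference maps.</p>
        </para>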
      </subsubsection>
    </subsection>
    <subsection inlist="toc" labels="LABEL:sec:_Pre-training" xml:id="S3.SS3">
      <tags>
        <tag>3.3</tag>
        <tag role="autoref">subsection 3.3</tag>
        <tag role="refnum">3.3</tag>
        <tag role="typerefnum">§3.3</tag>
      </tags>
      <title><tag close=" ">3.3</tag><text font="italic">Pre-training of feature encoders</text></title>
      <figure inlist="lof" labels="LABEL:figs:CA-I3D" placement="b" xml:id="S3.F5">
        <tags>
          <tag>Fig. 5</tag>
          <tag role="autoref">Figure 5</tag>
          <tag role="refnum">5</tag>
          <tag role="typerefnum">Fig. 5</tag>
        </tags>
        <graphics candidates="CA-I3D.pdf" class="ltx_centering" graphic="CA-I3D.pdf" options="width=390.258pt" xml:id="S3.F5.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">5</tag>The structure of the proposed CA-I3D model. We optimize the original I3D network to make the model more suitable for the MER task.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 5</tag>The structure of the proposed CA-I3D model. We optimize the original I3D network to make the model more suitable for the MER task.</caption>
      </figure>
      <para xml:id="S3.SS3.p1">
        <p>MPFNet comprises two feature encoders: the GFE and the AFE. The GFE is pre-trained with a triplet-network-based prior learning approach to extract general features for MER. The AFE is pre-trained on a larger, more balanced dataset, derived from the original ME dataset with a motion amplification model, so that it captures advanced ME features. The prior knowledge acquired by both encoders is then used to initialize the model, serving as the initial convolutional layer weights. Finally, the model is retrained on the original ME dataset, with the encoder parameters fine-tuned for accurate ME classification. Both the GFE and the AFE use the CA-I3D architecture as their backbone, which is described next.
</p>
      </para>
      <subsubsection inlist="toc" xml:id="S3.SS3.SSS1">
        <tags>
          <tag>3.3.1</tag>
          <tag role="autoref">subsubsection 3.3.1</tag>
          <tag role="refnum">3.3.1</tag>
          <tag role="typerefnum">§3.3.1</tag>
        </tags>
        <title><tag close=" ">3.3.1</tag>CA-I3D</title>
        <para xml:id="S3.SS3.SSS1.p1">
          <p>In this study, we integrate the I3D architecture with the coordinate attention (CA) block to construct the backbone of our feature encoders, termed CA-I3D, for precise spatiotemporal feature extraction of MEs. The I3D model extends 2D convolutional networks by inflating their 2D convolutional and pooling kernels into 3D, which introduces a temporal dimension and enables the modeling of dynamic information within video sequences. Meanwhile, the CA block enhances feature representation by capturing channel relationships and long-range dependencies with precise positional information. The overall architecture of the proposed CA-I3D model is illustrated in Fig. <ref labelref="LABEL:figs:CA-I3D"/>. The CA-I3D model comprises multiple 3D convolutional layers, max-pooling layers, and 3D CA-Inception v1 modules <cite class="ltx_citemacro_cite">[<bibref bibrefs="szegedy2015going" separator="," yyseparator=","/>]</cite>. It takes the fused feature tensor <Math mode="inline" tex="\mathcal{F}" text="F" xml:id="S3.SS3.SSS1.p1.m1">
              <XMath>
                <XMTok font="caligraphic" role="UNKNOWN">F</XMTok>
              </XMath>
            </Math> as input and produces the deep feature vector <Math mode="inline" tex="f_{\theta}(x)" text="f _ theta * x" xml:id="S3.SS3.SSS1.p1.m2">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS3.SSS1.p1.m2.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS3.SSS1.p1.m2.1">x</XMTok>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> for a given sample <Math mode="inline" tex="x" text="x" xml:id="S3.SS3.SSS1.p1.m3">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">x</XMTok>
              </XMath>
            </Math>, as defined below:
</p>
        </para>
        <para xml:id="S3.SS3.SSS1.p2">
          <equation xml:id="S3.E5">
            <tags>
              <tag>(5)</tag>
              <tag role="autoref">Equation 5</tag>
              <tag role="refnum">5</tag>
            </tags>
            <Math mode="display" tex="f_{\theta}(x)=\textit{CA-I3D}(\mathcal{F})." text="f _ theta * x = [CA-I3D] * F" xml:id="S3.E5.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E5.m1.3"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E5.m1.3">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMApp>
                          <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                          <XMTok font="italic" role="UNKNOWN">f</XMTok>
                          <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                        </XMApp>
                        <XMDual>
                          <XMRef idref="S3.E5.m1.1"/>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMTok font="italic" role="UNKNOWN" xml:id="S3.E5.m1.1">x</XMTok>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMText><text font="italic">CA-I3D</text></XMText>
                        <XMDual>
                          <XMRef idref="S3.E5.m1.2"/>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMTok font="caligraphic" role="UNKNOWN" xml:id="S3.E5.m1.2">F</XMTok>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PERIOD">.</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
        <para xml:id="S3.SS3.SSS1.p3">
          <p>To better align with the requirements of MER tasks, we optimized the original I3D network architecture. The model begins with a 3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m1">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m2">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3 convolutional layer for spatial feature extraction, followed by a 1<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m3">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m4">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3 max-pooling layer with strides of (1, 2, 2), which performs pooling along the height and width dimensions while preserving the channel and temporal dimensions.
The 3D CA-Inception v1 module employs multiple convolutional filters of varying sizes (e.g., 1<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m5">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>1<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m6">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>1 and 3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m7">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m8">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3) to capture diverse spatial patterns across different scales. It consists of multiple parallel convolutional branches to extract scale-specific features. The CA module is integrated after the concatenation layer of the 3D Inception v1 module within the I3D architecture. Another 1<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m9">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3<Math mode="inline" tex="\times" text="*" xml:id="S3.SS3.SSS1.p3.m10">
              <XMath>
                <XMTok meaning="times" role="MULOP">×</XMTok>
              </XMath>
            </Math>3 max-pooling layer with strides of (1, 2, 2) is applied to downsample the feature map while preserving essential information. An additional 3D CA-Inception v1 module is then incorporated to further enhance the network’s ability to capture complex spatial patterns. The network concludes with a linear layer to facilitate nonlinear transformations.
All convolutional layers use rectified linear unit (ReLU) activation, and the network weights are initialized randomly from a standard normal distribution (mean 0, variance 1). Relative to the original I3D, the first max-pooling layer was removed to prevent the loss of low-level image features typically associated with pooling operations. The final average-pooling layer was also removed, retaining only the convolutional layers; this reduces the number of parameters and preserves global image information, enhancing the network’s robustness. Furthermore, to mitigate the risk of overfitting, the number of Inception modules was reduced from nine to two.
</p>
        </para>
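        <para>
          <p>The effect of a 1×3×3 max-pooling layer with strides of (1, 2, 2) on the feature-map size can be checked with a few lines of Python. This is only a sketch: the helper name is hypothetical, the padding of (0, 1, 1) is an assumption (the text specifies only the kernel and the strides), and the input extent of 10×128×128 assumes that the preceding convolution preserves the size of the fused input.</p>
          <verbatim font="typewriter">
```python
def pool_output_shape(shape, kernel, stride, padding):
    """Output size per axis: floor((n + 2p - k) / s) + 1."""
    return tuple((n + 2 * p - k) // s + 1
                 for n, k, s, p in zip(shape, kernel, stride, padding))


# (T, H, W) of the feature map: 10 frame pairs at 128 x 128 (assumed).
shape = (10, 128, 128)
# Padding (0, 1, 1) is an assumption; the text gives only kernel and stride.
pooled = pool_output_shape(shape, kernel=(1, 3, 3), stride=(1, 2, 2), padding=(0, 1, 1))
print(pooled)  # (10, 64, 64)
```
          </verbatim>
          <p>Under these assumptions the temporal length stays at 10 while the height and width are halved, which matches the description of pooling only along the spatial dimensions.</p>
        </para>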
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S3.SS3.SSS2">
        <tags>
          <tag>3.3.2</tag>
          <tag role="autoref">subsubsection 3.3.2</tag>
          <tag role="refnum">3.3.2</tag>
          <tag role="typerefnum">§3.3.2</tag>
        </tags>
        <title><tag close=" ">3.3.2</tag>Pre-training of the GFE</title>
        <para xml:id="S3.SS3.SSS2.p1">
          <p>The GFE is pre-trained through prior learning based on a triplet network, which enables it to extract general features for MER and to distinguish similarities and differences between samples of different categories. Based on the assumption that ME samples from the same category should form tight clusters in the embedding space, we construct a triplet network so that samples with the same label lie close together in the embedding space, while samples with different labels lie farther apart. The triplet network consists of three CA-I3D components that share the same architecture and parameters. Its input is a series of triplets, each defined as <Math mode="inline" tex="[x_{a},x_{p},x_{n}]" text="list@(x _ a, x _ p, x _ n)" xml:id="S3.SS3.SSS2.p1.m1">
              <XMath>
                <XMDual>
                  <XMApp>
                    <XMTok meaning="list"/>
                    <XMRef idref="S3.SS3.SSS2.p1.m1.1"/>
                    <XMRef idref="S3.SS3.SSS2.p1.m1.2"/>
                    <XMRef idref="S3.SS3.SSS2.p1.m1.3"/>
                  </XMApp>
                  <XMWrap>
                    <XMTok role="OPEN" stretchy="false">[</XMTok>
                    <XMApp xml:id="S3.SS3.SSS2.p1.m1.1">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS3.SSS2.p1.m1.2">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">p</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS3.SSS2.p1.m1.3">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                    </XMApp>
                    <XMTok role="CLOSE" stretchy="false">]</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>, which consists of an anchor sample <Math mode="inline" tex="x_{a}" text="x _ a" xml:id="S3.SS3.SSS2.p1.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                </XMApp>
              </XMath>
            </Math>, a positive sample <Math mode="inline" tex="x_{p}" text="x _ p" xml:id="S3.SS3.SSS2.p1.m3">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">p</XMTok>
                </XMApp>
              </XMath>
            </Math> from the same category, and a negative sample <Math mode="inline" tex="x_{n}" text="x _ n" xml:id="S3.SS3.SSS2.p1.m4">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                </XMApp>
              </XMath>
            </Math> from a different category. During training, the model maps each input triplet to three feature vectors <Math mode="inline" tex="f(x_{a})" text="f * x _ a" xml:id="S3.SS3.SSS2.p1.m5">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMDual>
                    <XMRef idref="S3.SS3.SSS2.p1.m5.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS3.SSS2.p1.m5.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math>, <Math mode="inline" tex="f(x_{p})" text="f * x _ p" xml:id="S3.SS3.SSS2.p1.m6">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMDual>
                    <XMRef idref="S3.SS3.SSS2.p1.m6.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS3.SSS2.p1.m6.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">p</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="f(x_{n})" text="f * x _ n" xml:id="S3.SS3.SSS2.p1.m7">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMDual>
                    <XMRef idref="S3.SS3.SSS2.p1.m7.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS3.SSS2.p1.m7.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> after processing by the CA-I3D modules. We employ a triplet loss to learn a discriminative feature embedding in which the embedded distance of the positive pair (samples of the same class) is smaller than that of the negative pair (samples of different classes) by at least a margin. The triplet loss is defined as follows:
</p>
        </para>
        <para xml:id="S3.SS3.SSS2.p2">
          <equation xml:id="S3.E6">
            <tags>
              <tag>(6)</tag>
              <tag role="autoref">Equation 6</tag>
              <tag role="refnum">6</tag>
            </tags>
            <Math mode="display" tex="\small L_{t}=\sum_{x_{a}}\max\left(d\left(f\left(x_{a}\right),f\left(x_{p}\right)\right)^{2}-d\left(f\left(x_{a}\right),f\left(x_{n}\right)\right)^{2}+\alpha,0\right)," text="L _ t = (sum _ x _ a)@(maximum@((d * (open-interval@(f * x _ a, f * x _ p)) ^ 2 - d * (open-interval@(f * x _ a, f * x _ n)) ^ 2) + alpha, 0))" xml:id="S3.E6.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E6.m1.3"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E6.m1.3">
                      <XMTok fontsize="90%" meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" fontsize="90%" role="UNKNOWN">L</XMTok>
                        <XMTok font="italic" fontsize="63%" role="UNKNOWN">t</XMTok>
                      </XMApp>
                      <XMApp>
                        <XMApp scriptpos="mid">
                          <XMTok role="SUBSCRIPTOP" scriptpos="mid1"/>
                          <XMTok fontsize="90%" mathstyle="display" meaning="sum" role="SUMOP" scriptpos="mid">∑</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                            <XMTok font="italic" fontsize="63%" role="UNKNOWN">x</XMTok>
                            <XMTok font="italic" fontsize="45%" role="UNKNOWN">a</XMTok>
                          </XMApp>
                        </XMApp>
                        <XMDual>
                          <XMApp>
                            <XMRef idref="S3.E6.m1.1"/>
                            <XMRef idref="S3.E6.m1.3.1"/>
                            <XMRef idref="S3.E6.m1.2"/>
                          </XMApp>
                          <XMApp>
                            <XMTok fontsize="90%" meaning="maximum" role="OPFUNCTION" scriptpos="mid" xml:id="S3.E6.m1.1">max</XMTok>
                            <XMWrap>
                              <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                              <XMApp xml:id="S3.E6.m1.3.1">
                                <XMTok fontsize="90%" meaning="plus" role="ADDOP">+</XMTok>
                                <XMApp>
                                  <XMTok fontsize="90%" meaning="minus" role="ADDOP">-</XMTok>
                                  <XMApp>
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMTok font="italic" fontsize="90%" role="UNKNOWN">d</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMDual>
                                        <XMApp>
                                          <XMTok meaning="open-interval"/>
                                          <XMRef idref="S3.E6.m1.3.1.1"/>
                                          <XMRef idref="S3.E6.m1.3.1.2"/>
                                        </XMApp>
                                        <XMWrap>
                                          <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                          <XMApp xml:id="S3.E6.m1.3.1.1">
                                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                            <XMTok font="italic" fontsize="90%" role="UNKNOWN">f</XMTok>
                                            <XMDual>
                                              <XMRef idref="S3.E6.m1.3.1.1.1"/>
                                              <XMWrap>
                                                <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                                <XMApp xml:id="S3.E6.m1.3.1.1.1">
                                                  <XMTok role="SUBSCRIPTOP" scriptpos="post4"/>
                                                  <XMTok font="italic" fontsize="90%" role="UNKNOWN">x</XMTok>
                                                  <XMTok font="italic" fontsize="63%" role="UNKNOWN">a</XMTok>
                                                </XMApp>
                                                <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                              </XMWrap>
                                            </XMDual>
                                          </XMApp>
                                          <XMTok fontsize="90%" role="PUNCT">,</XMTok>
                                          <XMApp xml:id="S3.E6.m1.3.1.2">
                                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                            <XMTok font="italic" fontsize="90%" role="UNKNOWN">f</XMTok>
                                            <XMDual>
                                              <XMRef idref="S3.E6.m1.3.1.2.1"/>
                                              <XMWrap>
                                                <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                                <XMApp xml:id="S3.E6.m1.3.1.2.1">
                                                  <XMTok role="SUBSCRIPTOP" scriptpos="post4"/>
                                                  <XMTok font="italic" fontsize="90%" role="UNKNOWN">x</XMTok>
                                                  <XMTok font="italic" fontsize="63%" role="UNKNOWN">p</XMTok>
                                                </XMApp>
                                                <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                              </XMWrap>
                                            </XMDual>
                                          </XMApp>
                                          <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                      <XMTok fontsize="63%" meaning="2" role="NUMBER">2</XMTok>
                                    </XMApp>
                                  </XMApp>
                                  <XMApp>
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMTok font="italic" fontsize="90%" role="UNKNOWN">d</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMDual>
                                        <XMApp>
                                          <XMTok meaning="open-interval"/>
                                          <XMRef idref="S3.E6.m1.3.1.3"/>
                                          <XMRef idref="S3.E6.m1.3.1.4"/>
                                        </XMApp>
                                        <XMWrap>
                                          <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                          <XMApp xml:id="S3.E6.m1.3.1.3">
                                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                            <XMTok font="italic" fontsize="90%" role="UNKNOWN">f</XMTok>
                                            <XMDual>
                                              <XMRef idref="S3.E6.m1.3.1.3.1"/>
                                              <XMWrap>
                                                <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                                <XMApp xml:id="S3.E6.m1.3.1.3.1">
                                                  <XMTok role="SUBSCRIPTOP" scriptpos="post4"/>
                                                  <XMTok font="italic" fontsize="90%" role="UNKNOWN">x</XMTok>
                                                  <XMTok font="italic" fontsize="63%" role="UNKNOWN">a</XMTok>
                                                </XMApp>
                                                <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                              </XMWrap>
                                            </XMDual>
                                          </XMApp>
                                          <XMTok fontsize="90%" role="PUNCT">,</XMTok>
                                          <XMApp xml:id="S3.E6.m1.3.1.4">
                                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                            <XMTok font="italic" fontsize="90%" role="UNKNOWN">f</XMTok>
                                            <XMDual>
                                              <XMRef idref="S3.E6.m1.3.1.4.1"/>
                                              <XMWrap>
                                                <XMTok fontsize="90%" role="OPEN" stretchy="true">(</XMTok>
                                                <XMApp xml:id="S3.E6.m1.3.1.4.1">
                                                  <XMTok role="SUBSCRIPTOP" scriptpos="post4"/>
                                                  <XMTok font="italic" fontsize="90%" role="UNKNOWN">x</XMTok>
                                                  <XMTok font="italic" fontsize="63%" role="UNKNOWN">n</XMTok>
                                                </XMApp>
                                                <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                              </XMWrap>
                                            </XMDual>
                                          </XMApp>
                                          <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                      <XMTok fontsize="63%" meaning="2" role="NUMBER">2</XMTok>
                                    </XMApp>
                                  </XMApp>
                                </XMApp>
                                <XMTok font="italic" fontsize="90%" name="alpha" role="UNKNOWN">α</XMTok>
                              </XMApp>
                              <XMTok fontsize="90%" role="PUNCT">,</XMTok>
                              <XMTok fontsize="90%" meaning="0" role="NUMBER" xml:id="S3.E6.m1.2">0</XMTok>
                              <XMTok fontsize="90%" role="CLOSE" stretchy="true">)</XMTok>
                            </XMWrap>
                          </XMApp>
                        </XMDual>
                      </XMApp>
                    </XMApp>
                    <XMTok fontsize="90%" role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="\alpha" text="alpha" xml:id="S3.SS3.SSS2.p2.m1">
              <XMath>
                <XMTok font="italic" name="alpha" role="UNKNOWN">α</XMTok>
              </XMath>
            </Math> is a hyperparameter that sets the margin between the anchor's distances to <Math mode="inline" tex="x_{p}" text="x _ p" xml:id="S3.SS3.SSS2.p2.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">p</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="x_{n}" text="x _ n" xml:id="S3.SS3.SSS2.p2.m3">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                </XMApp>
              </XMath>
            </Math>. We use the Euclidean distance between ME features as the metric, defined as follows:
</p>
        </para>
        <para xml:id="S3.SS3.SSS2.p3">
          <equation xml:id="S3.E7">
            <tags>
              <tag>(7)</tag>
              <tag role="autoref">Equation 7</tag>
              <tag role="refnum">7</tag>
            </tags>
            <Math mode="display" tex="d(f(x_{a}),f(x_{n}))=\left\|f(x_{a})-f(x_{n})\right\|_{2}." text="d * open-interval@(f * x _ a, f * x _ n) = (norm@(f * x _ a - f * x _ n)) _ 2" xml:id="S3.E7.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E7.m1.1"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E7.m1.1">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">d</XMTok>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="open-interval"/>
                            <XMRef idref="S3.E7.m1.1.1"/>
                            <XMRef idref="S3.E7.m1.1.2"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMApp xml:id="S3.E7.m1.1.1">
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMTok font="italic" role="UNKNOWN">f</XMTok>
                              <XMDual>
                                <XMRef idref="S3.E7.m1.1.1.1"/>
                                <XMWrap>
                                  <XMTok role="OPEN" stretchy="false">(</XMTok>
                                  <XMApp xml:id="S3.E7.m1.1.1.1">
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                                  </XMApp>
                                  <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                            <XMApp xml:id="S3.E7.m1.1.2">
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMTok font="italic" role="UNKNOWN">f</XMTok>
                              <XMDual>
                                <XMRef idref="S3.E7.m1.1.2.1"/>
                                <XMWrap>
                                  <XMTok role="OPEN" stretchy="false">(</XMTok>
                                  <XMApp xml:id="S3.E7.m1.1.2.1">
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                                  </XMApp>
                                  <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="norm"/>
                            <XMRef idref="S3.E7.m1.1.3"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="VERTBAR" stretchy="true">∥</XMTok>
                            <XMApp xml:id="S3.E7.m1.1.3">
                              <XMTok meaning="minus" role="ADDOP">-</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMDual>
                                  <XMRef idref="S3.E7.m1.1.3.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E7.m1.1.3.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMDual>
                                  <XMRef idref="S3.E7.m1.1.3.2"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E7.m1.1.3.2">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="VERTBAR" stretchy="true">∥</XMTok>
                          </XMWrap>
                        </XMDual>
                        <XMTok fontsize="70%" meaning="2" role="NUMBER">2</XMTok>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PERIOD">.</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
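        <para>
          <p>As a minimal sketch of Equations 6 and 7, the following pure-Python snippet computes the Euclidean distance between two embeddings and sums the hinge terms over a batch of already-embedded triplets. The function names, the plain-list representation of the feature vectors, and the margin value in the usage note are illustrative assumptions and are not taken from the authors' implementation.</p>
<verbatim>
```python
import math
from typing import Iterable, Sequence, Tuple

Vector = Sequence[float]


def euclidean_distance(u: Vector, v: Vector) -> float:
    """L2 distance between two embedding vectors (Equation 7)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(triplets: Iterable[Tuple[Vector, Vector, Vector]], alpha: float) -> float:
    """Sum of hinge terms over embedded triplets (Equation 6).

    Each triplet holds the embeddings (f(x_a), f(x_p), f(x_n)).
    `alpha` is the margin; its value is chosen by the caller (assumption).
    """
    total = 0.0
    for f_a, f_p, f_n in triplets:
        d_ap = euclidean_distance(f_a, f_p)  # anchor-positive distance
        d_an = euclidean_distance(f_a, f_n)  # anchor-negative distance
        # A triplet contributes only while the negative is not yet
        # farther than the positive by at least the margin.
        total += max(d_ap ** 2 - d_an ** 2 + alpha, 0.0)
    return total
```
</verbatim>
          <p>For instance, with a margin of 1.0 the triplet ([0, 0], [3, 4], [0, 1]) contributes 25 - 1 + 1 = 25, whereas the triplet ([0, 0], [0, 1], [3, 4]) already satisfies the margin and contributes max(1 - 25 + 1, 0) = 0.</p>
        </para>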
        <para xml:id="S3.SS3.SSS2.p4">
          <p>After several epochs of training on the ME dataset, the network converges to a stable error on the triplet comparisons. We then use the convolutional-layer parameters of the trained CA-I3D model to initialize our GFE, which extracts the generic features of MEs.
</p>
        </para>
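        <para>
          <p>The initialization step can be pictured as a selective copy of parameters. The sketch below is only an illustration under stated assumptions: parameters are modeled as a plain dictionary from names to lists of floats, and the names conv1.weight and fc.weight, as well as the rule that a name starting with "conv" marks a convolutional layer, are hypothetical and do not come from the paper.</p>
<verbatim>
```python
from typing import Callable, Dict, List

Params = Dict[str, List[float]]


def init_from_pretrained(
    target: Params,
    pretrained: Params,
    is_transferable: Callable[[str], bool],
) -> Params:
    """Return a copy of `target` whose transferable entries come from `pretrained`."""
    # Copy first so the caller's dictionaries are left untouched.
    initialized = {name: list(values) for name, values in target.items()}
    for name, values in pretrained.items():
        # Overwrite only entries that exist in the target and pass the filter.
        if name in initialized and is_transferable(name):
            initialized[name] = list(values)
    return initialized


# Hypothetical parameter names and values, for illustration only.
pretrained_ca_i3d = {"conv1.weight": [0.5, -0.1], "fc.weight": [0.9]}
fresh_gfe = {"conv1.weight": [0.0, 0.0], "fc.weight": [0.0]}
gfe = init_from_pretrained(
    fresh_gfe, pretrained_ca_i3d, lambda name: name.startswith("conv")
)
```
</verbatim>
          <p>In this toy setting only the entry recognized as convolutional is taken from the pre-trained parameters; every other entry keeps its fresh initialization.</p>
        </para>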
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S3.SS3.SSS3">
        <tags>
          <tag>3.3.3</tag>
          <tag role="autoref">subsubsection 3.3.3</tag>
          <tag role="refnum">3.3.3</tag>
          <tag role="typerefnum">§3.3.3</tag>
        </tags>
        <title><tag close=" ">3.3.3</tag>Pre-training of the AFE</title>
        <table inlist="lot" labels="LABEL:table:equalization" placement="t" xml:id="S3.T1">
          <tags>
            <tag><text fontsize="120%">TABLE I</text></tag>
            <tag role="autoref"><text fontsize="120%">Table I</text></tag>
            <tag role="refnum"><text fontsize="120%">I</text></tag>
            <tag role="typerefnum"><text fontsize="120%">TABLE I</text></tag>
          </tags>
          <toccaption><tag close=" "><text fontsize="120%">I</text></tag><text fontsize="120%">The process of sample equalization for the ME datasets used in this study</text></toccaption>
          <caption fontsize="120%"><tag close=": ">TABLE I</tag>The process of sample equalization for the ME datasets used in this study</caption>
          <block depth="0.0pt" width="411.9pt">
            <tabular class="ltx_guessed_headers" vattach="middle">
              <thead>
                <tr>
                  <td align="center" border="tt t" rowspan="2" thead="column row"><text fontsize="120%">Category</text></td>
                  <td align="center" border="tt t" colspan="3" thead="column"><text fontsize="120%">SMIC-HS </text><cite class="ltx_citemacro_cite"><text fontsize="120%">[</text><bibref bibrefs="pfister2011recognising" separator="," yyseparator=","/><text fontsize="120%">]</text></cite></td>
                  <td align="center" border="tt t" colspan="3" thead="column"><text fontsize="120%">CASME II </text><cite class="ltx_citemacro_cite"><text fontsize="120%">[</text><bibref bibrefs="yan2014casme" separator="," yyseparator=","/><text fontsize="120%">]</text></cite></td>
                  <td align="center" border="tt t" colspan="3" thead="column"><text fontsize="120%">SAMM </text><cite class="ltx_citemacro_cite"><text fontsize="120%">[</text><bibref bibrefs="davison2016samm" separator="," yyseparator=","/><text fontsize="120%">]</text></cite></td>
                  <td align="center" border="tt t" colspan="2" thead="column"><text fontsize="120%">MEGC2019-CD </text><cite class="ltx_citemacro_cite"><text fontsize="120%">[</text><bibref bibrefs="see2019megc" separator="," yyseparator=","/><text fontsize="120%">]</text></cite></td>
                </tr>
                <tr>
                  <td align="justify" border="t" thead="column" width="39.8pt"><text fontsize="120%">Raw-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="65.4pt"><text fontsize="120%">Amplified-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="22.8pt"><Math class="ltx_centering" mode="inline" tex="\varphi" text="varphi" xml:id="S3.T1.m1">
                      <XMath>
                        <XMTok font="italic" fontsize="120%" name="varphi" role="UNKNOWN">φ</XMTok>
                      </XMath>
                    </Math></td>
                  <td align="justify" border="t" thead="column" width="42.7pt"><text fontsize="120%">Raw-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="65.4pt"><text fontsize="120%">Amplified-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="22.8pt"><Math class="ltx_centering" mode="inline" tex="\varphi" text="varphi" xml:id="S3.T1.m2">
                      <XMath>
                        <XMTok font="italic" fontsize="120%" name="varphi" role="UNKNOWN">φ</XMTok>
                      </XMath>
                    </Math></td>
                  <td align="justify" border="t" thead="column" width="42.7pt"><text fontsize="120%">Raw-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="65.4pt"><text fontsize="120%">Amplified-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="22.8pt"><Math class="ltx_centering" mode="inline" tex="\varphi" text="varphi" xml:id="S3.T1.m3">
                      <XMath>
                        <XMTok font="italic" fontsize="120%" name="varphi" role="UNKNOWN">φ</XMTok>
                      </XMath>
                    </Math></td>
                  <td align="justify" border="t" thead="column" width="42.7pt"><text fontsize="120%">Raw-MEs</text></td>
                  <td align="justify" border="t" thead="column" width="65.4pt"><text fontsize="120%">Amplified-MEs</text></td>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td align="center" border="t" thead="row"><text fontsize="120%">Negative</text></td>
                  <td align="justify" border="t" width="39.8pt"><text fontsize="120%">70</text></td>
                  <td align="justify" border="t" width="65.4pt"><text fontsize="120%">630</text></td>
                  <td align="justify" border="t" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m4">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">8</text></td>
                  <td align="justify" border="t" width="42.7pt"><text fontsize="120%">88</text></td>
                  <td align="justify" border="t" width="65.4pt"><text fontsize="120%">440</text></td>
                  <td align="justify" border="t" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m5">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">4</text></td>
                  <td align="justify" border="t" width="42.7pt"><text fontsize="120%">92</text></td>
                  <td align="justify" border="t" width="65.4pt"><text fontsize="120%">368</text></td>
                  <td align="justify" border="t" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m6">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">3</text></td>
                  <td align="justify" border="t" width="42.7pt"><text fontsize="120%">250</text></td>
                  <td align="justify" border="t" width="65.4pt"><text fontsize="120%">1438</text></td>
                </tr>
                <tr>
                  <td align="center" thead="row"><text fontsize="120%">Positive</text></td>
                  <td align="justify" width="39.8pt"><text fontsize="120%">51</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">663</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m7">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">12</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">32</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">416</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m8">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">12</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">26</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">260</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m9">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">9</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">109</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">1339</text></td>
                </tr>
                <tr>
                  <td align="center" thead="row"><text fontsize="120%">Surprise</text></td>
                  <td align="justify" width="39.8pt"><text fontsize="120%">43</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">645</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m10">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">14</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">25</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">375</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m11">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">14</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">15</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">225</text></td>
                  <td align="justify" width="22.8pt"><text fontsize="120%">1</text><Math class="ltx_centering" mode="inline" tex="\sim" text="similar-to" xml:id="S3.T1.m12">
                      <XMath>
                        <XMTok fontsize="120%" meaning="similar-to" name="sim" role="RELOP">∼</XMTok>
                      </XMath>
                    </Math><text fontsize="120%">14</text></td>
                  <td align="justify" width="42.7pt"><text fontsize="120%">83</text></td>
                  <td align="justify" width="65.4pt"><text fontsize="120%">1245</text></td>
                </tr>
                <tr>
                  <td align="center" border="bb b" thead="row"><text fontsize="120%">Total</text></td>
                  <td align="justify" border="bb b" width="39.8pt"><text fontsize="120%">164</text></td>
                  <td align="justify" border="bb b" width="65.4pt"><text fontsize="120%">1938</text></td>
                  <td align="justify" border="bb b" width="22.8pt"><text fontsize="120%">–</text></td>
                  <td align="justify" border="bb b" width="42.7pt"><text fontsize="120%">145</text></td>
                  <td align="justify" border="bb b" width="65.4pt"><text fontsize="120%">1231</text></td>
                  <td align="justify" border="bb b" width="22.8pt"><text fontsize="120%">–</text></td>
                  <td align="justify" border="bb b" width="42.7pt"><text fontsize="120%">133</text></td>
                  <td align="justify" border="bb b" width="65.4pt"><text fontsize="120%">853</text></td>
                  <td align="justify" border="bb b" width="22.8pt"><text fontsize="120%">–</text></td>
                  <td align="justify" border="bb b" width="42.7pt"><text fontsize="120%">442</text></td>
                  <td align="justify" border="bb b" width="65.4pt"><text fontsize="120%">4022</text></td>
                </tr>
              </tbody>
            </tabular>
          </block>
        </table>
        <para xml:id="S3.SS3.SSS3.p1">
          <p>The pretrained GFE extracts general features of MEs. However, these general features alone are insufficient to model the fine-grained information required for MER. To address this limitation, we designed and pretrained an additional feature encoder, termed the AFE, to capture the advanced features essential for MER: subtle facial movement patterns, complex texture characteristics, and fine-grained motion details. The AFE is pretrained with a prior learning strategy on a balanced, motion-amplified ME dataset. Specifically, we employed the video magnification model proposed by Oh <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="oh2018learning" separator="," yyseparator=","/>]</cite>, using the frame sequences from the onset frame to the apex frame of each ME as input. To address the imbalance in sample distribution across categories, we adjusted the range of magnification factors according to each category’s sample size. Categories with fewer samples were assigned a wider range of magnification factors, which generates more synthetic samples for them and improves dataset balance. This approach both amplifies subtle motions in the videos and mitigates sample imbalance within the dataset.
As defined by Wu et al. <cite class="ltx_citemacro_cite">[<bibref bibrefs="wu2012eulerian" separator="," yyseparator=","/>]</cite> in their work on motion magnification, a single frame within a continuous video can be described as:
</p>
        </para>
        <para xml:id="S3.SS3.SSS3.p2">
          <equation xml:id="S3.E8">
            <tags>
              <tag>(8)</tag>
              <tag role="autoref">Equation 8</tag>
              <tag role="refnum">8</tag>
            </tags>
            <Math mode="display" tex="I(x,t)=f(x+\delta(x,t))," text="I * open-interval@(x, t) = f * (x + delta * open-interval@(x, t))" xml:id="S3.E8.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E8.m1.5"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E8.m1.5">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">I</XMTok>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="open-interval"/>
                            <XMRef idref="S3.E8.m1.1"/>
                            <XMRef idref="S3.E8.m1.2"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMTok font="italic" role="UNKNOWN" xml:id="S3.E8.m1.1">x</XMTok>
                            <XMTok role="PUNCT">,</XMTok>
                            <XMTok font="italic" role="UNKNOWN" xml:id="S3.E8.m1.2">t</XMTok>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                        <XMDual>
                          <XMRef idref="S3.E8.m1.5.1"/>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMApp xml:id="S3.E8.m1.5.1">
                              <XMTok meaning="plus" role="ADDOP">+</XMTok>
                              <XMTok font="italic" role="UNKNOWN">x</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" name="delta" role="UNKNOWN">δ</XMTok>
                                <XMDual>
                                  <XMApp>
                                    <XMTok meaning="open-interval"/>
                                    <XMRef idref="S3.E8.m1.3"/>
                                    <XMRef idref="S3.E8.m1.4"/>
                                  </XMApp>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMTok font="italic" role="UNKNOWN" xml:id="S3.E8.m1.3">x</XMTok>
                                    <XMTok role="PUNCT">,</XMTok>
                                    <XMTok font="italic" role="UNKNOWN" xml:id="S3.E8.m1.4">t</XMTok>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="\delta(x,t)" text="delta * open-interval@(x, t)" xml:id="S3.SS3.SSS3.p2.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" name="delta" role="UNKNOWN">δ</XMTok>
                  <XMDual>
                    <XMApp>
                      <XMTok meaning="open-interval"/>
                      <XMRef idref="S3.SS3.SSS3.p2.m1.1"/>
                      <XMRef idref="S3.SS3.SSS3.p2.m1.2"/>
                    </XMApp>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS3.SSS3.p2.m1.1">x</XMTok>
                      <XMTok role="PUNCT">,</XMTok>
                      <XMTok font="italic" role="UNKNOWN" xml:id="S3.SS3.SSS3.p2.m1.2">t</XMTok>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> is the motion field at position <Math mode="inline" tex="x" text="x" xml:id="S3.SS3.SSS3.p2.m2">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">x</XMTok>
              </XMath>
            </Math> and time <Math mode="inline" tex="t" text="t" xml:id="S3.SS3.SSS3.p2.m3">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">t</XMTok>
              </XMath>
            </Math>.
Applying motion magnification to the original frame yields the magnified image <Math mode="inline" tex="I_{mag}" text="I _ (m * a * g)" xml:id="S3.SS3.SSS3.p2.m4">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">I</XMTok>
                  <XMApp>
                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">m</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">g</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>:
</p>
        </para>
        <para xml:id="S3.SS3.SSS3.p3">
          <equation xml:id="S3.E9">
            <tags>
              <tag>(9)</tag>
              <tag role="autoref">Equation 9</tag>
              <tag role="refnum">9</tag>
            </tags>
            <Math mode="display" tex="I_{mag}(x,t)=f(x+(1+\varphi)\delta(x,t))," text="I _ (m * a * g) * open-interval@(x, t) = f * (x + (1 + varphi) * delta * open-interval@(x, t))" xml:id="S3.E9.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E9.m1.5"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E9.m1.5">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMApp>
                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                          <XMTok font="italic" role="UNKNOWN">I</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">m</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">a</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">g</XMTok>
                          </XMApp>
                        </XMApp>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="open-interval"/>
                            <XMRef idref="S3.E9.m1.1"/>
                            <XMRef idref="S3.E9.m1.2"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMTok font="italic" role="UNKNOWN" xml:id="S3.E9.m1.1">x</XMTok>
                            <XMTok role="PUNCT">,</XMTok>
                            <XMTok font="italic" role="UNKNOWN" xml:id="S3.E9.m1.2">t</XMTok>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                        <XMDual>
                          <XMRef idref="S3.E9.m1.5.1"/>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMApp xml:id="S3.E9.m1.5.1">
                              <XMTok meaning="plus" role="ADDOP">+</XMTok>
                              <XMTok font="italic" role="UNKNOWN">x</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMDual>
                                  <XMRef idref="S3.E9.m1.5.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E9.m1.5.1.1">
                                      <XMTok meaning="plus" role="ADDOP">+</XMTok>
                                      <XMTok meaning="1" role="NUMBER">1</XMTok>
                                      <XMTok font="italic" name="varphi" role="UNKNOWN">φ</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                                <XMTok font="italic" name="delta" role="UNKNOWN">δ</XMTok>
                                <XMDual>
                                  <XMApp>
                                    <XMTok meaning="open-interval"/>
                                    <XMRef idref="S3.E9.m1.3"/>
                                    <XMRef idref="S3.E9.m1.4"/>
                                  </XMApp>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMTok font="italic" role="UNKNOWN" xml:id="S3.E9.m1.3">x</XMTok>
                                    <XMTok role="PUNCT">,</XMTok>
                                    <XMTok font="italic" role="UNKNOWN" xml:id="S3.E9.m1.4">t</XMTok>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="\varphi" text="varphi" xml:id="S3.SS3.SSS3.p3.m1">
              <XMath>
                <XMTok font="italic" name="varphi" role="UNKNOWN">φ</XMTok>
              </XMath>
            </Math> is the magnification factor.</p>
        </para>
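        <para>
          <p>The relation between Eq. (8) and Eq. (9) can be illustrated with a minimal one-dimensional sketch. The intensity profile f and the displacement field delta below are hypothetical choices made only for illustration; they are not taken from this work, and the learning-based magnification model used here does not compute the displacement explicitly in this way.</p>
          <verbatim>
```python
import math


def f(x):
    """Hypothetical 1-D intensity profile (a smooth bump)."""
    return math.exp(-x * x)


def delta(x, t):
    """Hypothetical displacement field: a uniform shift that grows with t."""
    return 0.01 * t


def frame(x, t):
    # Eq. (8): I(x, t) = f(x + delta(x, t))
    return f(x + delta(x, t))


def magnified_frame(x, t, phi):
    # Eq. (9): I_mag(x, t) = f(x + (1 + phi) * delta(x, t))
    return f(x + (1.0 + phi) * delta(x, t))
```
          </verbatim>
          <p>With a magnification factor of 0, magnified_frame reduces to frame; larger factors scale the displacement by (1 + phi) while leaving the underlying profile f unchanged.</p>
        </para>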
        <para xml:id="S3.SS3.SSS3.p4">
          <p>The sample balancing process for the ME datasets is shown in Table <ref labelref="LABEL:table:equalization"/>. The original ME datasets suffer from significant class imbalance. For instance, in the CASME II dataset, the “negative” class contains 88 samples, while the “surprise” class has only 25. Similarly, in the SAMM dataset, the “negative” class contains 92 samples, while the “surprise” class contains just 15. By applying different ranges of magnification factors to different classes, we achieved three objectives: (i) a roughly ninefold increase in sample size (from 442 to 4022 samples), (ii) mitigation of the class imbalance, and (iii) enhanced visibility of facial muscle movements. Our experiments indicate that excessive magnification (especially beyond a factor of 15) leads to severe facial distortions and reduced image quality. Therefore, we set the maximum magnification factor to <Math mode="inline" tex="\varphi" text="varphi" xml:id="S3.SS3.SSS3.p4.m1">
              <XMath>
                <XMTok font="italic" name="varphi" role="UNKNOWN">φ</XMTok>
              </XMath>
            </Math> = 14 in this study. Subsequently, we trained the AFE on the augmented ME dataset to improve its ability to capture more advanced and abstract ME features.
</p>
        </para>
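        <para>
          <p>The counts in the balancing table are consistent with a simple rule, sketched below as an assumption rather than as the exact procedure used: each original clip is kept, and one magnified copy is generated for every integer magnification factor from 1 up to the class-specific maximum. The function name and the dictionary layout are hypothetical; the sample counts and factor ranges are read from the table.</p>
          <verbatim>
```python
def augmented_count(n_original, max_factor):
    """Original clips plus one magnified copy per integer factor 1..max_factor.

    Assumed counting rule, inferred from the balancing table.
    """
    return n_original * (max_factor + 1)


# (dataset, class) -> (original samples, maximum magnification factor)
examples = {
    ("CASME II", "surprise"): (25, 14),
    ("SAMM", "surprise"): (15, 14),
    ("CASME II", "positive"): (32, 12),
    ("SAMM", "positive"): (26, 9),
}

counts = {key: augmented_count(n, k) for key, (n, k) in examples.items()}
```
          </verbatim>
          <p>Under this rule the four entries evaluate to 375, 225, 416, and 260 samples, matching the corresponding table cells.</p>
        </para>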
      </subsubsection>
    </subsection>
    <subsection inlist="toc" labels="LABEL:sec:_variants" xml:id="S3.SS4">
      <tags>
        <tag>3.4</tag>
        <tag role="autoref">subsection 3.4</tag>
        <tag role="refnum">3.4</tag>
        <tag role="typerefnum">§3.4</tag>
      </tags>
      <title><tag close=" ">3.4</tag><text font="italic">Model variants: MPFNet-P and MPFNet-C</text></title>
      <para xml:id="S3.SS4.p1">
        <p>To explain how MPFNet operates, this section describes its two variants, MPFNet-P and MPFNet-C, and details the network architecture and workflow of each.</p>
      </para>
      <subsubsection inlist="toc" xml:id="S3.SS4.SSS1">
        <tags>
          <tag>3.4.1</tag>
          <tag role="autoref">subsubsection 3.4.1</tag>
          <tag role="refnum">3.4.1</tag>
          <tag role="typerefnum">§3.4.1</tag>
        </tags>
        <title><tag close=" ">3.4.1</tag>MPFNet-P</title>
        <para xml:id="S3.SS4.SSS1.p1">
          <p>After training the GFE and AFE, we integrate the two encoders, which carry complementary prior knowledge, in parallel within a metric-based meta-learning framework to form MPFNet-P. The architecture of MPFNet-P is shown in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>. The GFE and AFE operate independently, extracting general and advanced features from the input data, respectively, so that multi-perspective feature representations are processed in parallel. The overall workflow of MPFNet-P is as follows:</p>
        </para>
        <para xml:id="S3.SS4.SSS1.p2">
          <p>(i) Data preprocessing module: The input to this module consists of image pairs formed by the onset and apex frames of ME samples. A frame interpolation method is applied to generate ME frame sequences of normalized length. Subsequently, optical flow and frame difference-based fusion features are computed to enhance the representation of spatiotemporal information. The position of this module within the workflow is illustrated in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>(a), with further details provided in Section <ref labelref="LABEL:sec:_Data_preprocessing"/>.
(ii) Prior learning based on a triplet network (PLTN): This module enables the model to learn generic feature extraction capabilities by constructing triplet ME samples and assessing their similarity. The acquired prior knowledge is stored in frozen parameters and subsequently used to initialize the GFE, as shown in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>(b).
(iii) Prior learning based on sample-balanced motion-amplified MEs (PLSM): This module involves training a CA-I3D model on a sample-balanced and augmented ME dataset to improve its ability to capture advanced ME features. Upon completion of training, the learned parameters are frozen and transferred to the AFE, as depicted in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>(c). Additional details regarding these two prior learning processes can be found in Section <ref labelref="LABEL:sec:_Pre-training"/>.
</p>
        </para>
        <para xml:id="S3.SS4.SSS1.p3">
          <p>(iv) Multi-prior fusion meta-learning module: Within the metric-based meta-learning framework, this module integrates the complementary prior knowledge of GFE and AFE to enhance the model’s MER capability, as illustrated in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>(d).
In this module, we first encode the deep feature vectors of the samples in both the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS1.p3.m1">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math> and the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS4.SSS1.p3.m2">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">Q</XMTok>
              </XMath>
            </Math>. Then, we average the feature vectors of samples within the same class in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS1.p3.m3">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math> to obtain the mean feature vector, which serves as the class centroid in the embedding space. The specific calculation method is as follows:

</p>
        </para>
        <para xml:id="S3.SS4.SSS1.p4">
          <equation labels="LABEL:wc" xml:id="S3.E10">
            <tags>
              <tag>(10)</tag>
              <tag role="autoref">Equation 10</tag>
              <tag role="refnum">10</tag>
            </tags>
            <Math mode="display" tex="w_{c}=\frac{1}{|S_{c}|}\sum f_{\theta}(x_{s}),x_{s}\in S_{c}," text="formulae@(w _ c = (1 / absolute-value@(S _ c)) * sum@(f _ theta * x _ s), x _ s element-of S _ c)" xml:id="S3.E10.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E10.m1.2"/>
                  <XMWrap>
                    <XMDual xml:id="S3.E10.m1.2">
                      <XMApp>
                        <XMTok meaning="formulae"/>
                        <XMRef idref="S3.E10.m1.2.1"/>
                        <XMRef idref="S3.E10.m1.2.2"/>
                      </XMApp>
                      <XMWrap>
                        <XMApp xml:id="S3.E10.m1.2.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">w</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMApp>
                              <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                              <XMTok meaning="1" role="NUMBER">1</XMTok>
                              <XMDual>
                                <XMApp>
                                  <XMTok meaning="absolute-value"/>
                                  <XMRef idref="S3.E10.m1.1"/>
                                </XMApp>
                                <XMWrap>
                                  <XMTok role="VERTBAR" stretchy="false">|</XMTok>
                                  <XMApp xml:id="S3.E10.m1.1">
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">S</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                  </XMApp>
                                  <XMTok role="VERTBAR" stretchy="false">|</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                            <XMApp>
                              <XMTok mathstyle="display" meaning="sum" role="SUMOP" scriptpos="mid">∑</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E10.m1.2.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E10.m1.2.1.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT">,</XMTok>
                        <XMApp xml:id="S3.E10.m1.2.2">
                          <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">x</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">S</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                          </XMApp>
                        </XMApp>
                      </XMWrap>
                    </XMDual>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="f_{\theta}" text="f _ theta" xml:id="S3.SS4.SSS1.p4.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                </XMApp>
              </XMath>
            </Math> represents the feature encoder, <Math mode="inline" tex="f_{\theta}(x_{s})" text="f _ theta * x _ s" xml:id="S3.SS4.SSS1.p4.m2">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS1.p4.m2.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS1.p4.m2.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> denotes the deep feature vector of a data sample <Math mode="inline" tex="x_{s}" text="x _ s" xml:id="S3.SS4.SSS1.p4.m3">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                </XMApp>
              </XMath>
            </Math> in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS1.p4.m4">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math>, <Math mode="inline" tex="S_{c}" text="S _ c" xml:id="S3.SS4.SSS1.p4.m5">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">S</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
              </XMath>
            </Math> is the sample cluster of the <Math mode="inline" tex="c" text="c" xml:id="S3.SS4.SSS1.p4.m6">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">c</XMTok>
              </XMath>
            </Math>-<Math mode="inline" tex="\mathrm{th}" text="th" xml:id="S3.SS4.SSS1.p4.m7">
              <XMath>
                <XMTok role="UNKNOWN">th</XMTok>
              </XMath>
            </Math> class, and <Math mode="inline" tex="w_{c}" text="w _ c" xml:id="S3.SS4.SSS1.p4.m8">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
              </XMath>
            </Math> is the centroid of the <Math mode="inline" tex="c" text="c" xml:id="S3.SS4.SSS1.p4.m9">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">c</XMTok>
              </XMath>
            </Math>-<Math mode="inline" tex="\mathrm{th}" text="th" xml:id="S3.SS4.SSS1.p4.m10">
              <XMath>
                <XMTok role="UNKNOWN">th</XMTok>
              </XMath>
            </Math> class.
</p>
        </para>
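        <para>
          <p>The centroid computation described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming that each centroid is the arithmetic mean of the support embeddings of its class, as is usual for prototype-based methods. The function name class_centroids and the toy feature values are hypothetical and do not come from the authors' implementation.</p>
```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def class_centroids(
    features: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
) -> Dict[Hashable, List[float]]:
    """Return the mean feature vector w_c for every class c in the support set.

    `features[i]` stands for the embedding f_theta(x_s) of the i-th support
    sample and `labels[i]` for its class; both names are illustrative.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")

    # Group the support embeddings by class to form each cluster S_c.
    clusters = defaultdict(list)
    for vector, label in zip(features, labels):
        clusters[label].append(vector)

    # Average each cluster coordinate-wise to obtain its centroid w_c
    # (assumption: the centroid is the plain arithmetic mean).
    centroids = {}
    for label, vectors in clusters.items():
        count = len(vectors)
        centroids[label] = [sum(column) / count for column in zip(*vectors)]
    return centroids


# Toy 2-way, 2-shot support set with made-up 3-dimensional embeddings.
support_features = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 4.0],
    [0.0, 2.0, 0.0],
    [0.0, 4.0, 2.0],
]
support_labels = [0, 0, 1, 1]
centroids = class_centroids(support_features, support_labels)
```
          <p>In a dual-stream setting, the same routine would presumably be applied once per stream, giving one set of centroids for each feature encoder.</p>
        </para>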
        <figure inlist="lof" labels="LABEL:figs:MPFNet-P" placement="t" xml:id="S3.F6">
          <tags>
            <tag>Fig. 6</tag>
            <tag role="autoref">Figure 6</tag>
            <tag role="refnum">6</tag>
            <tag role="typerefnum">Fig. 6</tag>
          </tags>
          <graphics candidates="MPFNet-P.pdf" class="ltx_centering" graphic="MPFNet-P.pdf" options="width=368.577pt" xml:id="S3.F6.g1"/>
          <toccaption class="ltx_centering"><tag close=" ">6</tag>The structure of MPFNet-P, which has a dual-stream architecture and includes five components. One data stream (blue lines) encodes the generic features of MEs, while the other (yellow lines) encodes advanced features. The two streams are fused using a weighted-sum model fusion method for the final classification of MEs.</toccaption>
          <caption class="ltx_centering"><tag close=": ">Fig. 6</tag>The structure of MPFNet-P, which has a dual-stream architecture and includes five components. One data stream (blue lines) encodes the generic features of MEs, while the other (yellow lines) encodes advanced features. The two streams are fused using a weighted-sum model fusion method for the final classification of MEs.</caption>
        </figure>
        <para xml:id="S3.SS4.SSS1.p5">
          <p>We then follow a standard metric-based meta-learning procedure and compute the cosine similarity between each embedded sample <Math mode="inline" tex="f_{\theta}(x_{q})" text="f _ theta * x _ q" xml:id="S3.SS4.SSS1.p5.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS1.p5.m1.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS1.p5.m1.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> in the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS4.SSS1.p5.m2">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">Q</XMTok>
              </XMath>
            </Math> and the centroid of each class <Math mode="inline" tex="w_{c}" text="w _ c" xml:id="S3.SS4.SSS1.p5.m3">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
              </XMath>
            </Math> in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS1.p5.m4">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math> within the embedding space. The calculation is as follows:
</p>
        </para>
        <para xml:id="S3.SS4.SSS1.p6">
          <equationgroup class="ltx_eqn_align" xml:id="Sx1.EGx1">
            <equation xml:id="S3.E11">
              <tags>
                <tag>(11)</tag>
                <tag role="autoref">Equation 11</tag>
                <tag role="refnum">11</tag>
              </tags>
              <MathFork>
                <Math tex="\displaystyle d_{GFE}=similarity(f_{\theta}^{G}(x_{q}),w_{c}^{G})=\frac{f_{%&#10;\theta}^{G}(x_{q})\cdot w_{c}^{G}}{\|f_{\theta}^{G}(x_{q})\|\|w_{c}^{G}\|}," text="d _ (G * F * E) = s * i * m * i * l * a * r * i * t * y * open-interval@((f _ theta) ^ G * x _ q, (w _ c) ^ G) = (((f _ theta) ^ G * x _ q) cdot (w _ c) ^ G) / (norm@((f _ theta) ^ G * x _ q) * norm@((w _ c) ^ G))" xml:id="S3.E11.m2">
                  <XMath>
                    <XMDual>
                      <XMRef idref="S3.E11.m2.4"/>
                      <XMWrap>
                        <XMApp xml:id="S3.E11.m2.4">
                          <XMTok meaning="multirelation"/>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">d</XMTok>
                            <XMApp>
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">s</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">m</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">l</XMTok>
                            <XMTok font="italic" role="UNKNOWN">a</XMTok>
                            <XMTok font="italic" role="UNKNOWN">r</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">t</XMTok>
                            <XMTok font="italic" role="UNKNOWN">y</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="open-interval"/>
                                <XMRef idref="S3.E11.m2.4.1"/>
                                <XMRef idref="S3.E11.m2.4.2"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E11.m2.4.1">
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  </XMApp>
                                  <XMDual>
                                    <XMRef idref="S3.E11.m2.4.1.1"/>
                                    <XMWrap>
                                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                                      <XMApp xml:id="S3.E11.m2.4.1.1">
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                      </XMApp>
                                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E11.m2.4.2">
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                            <XMApp>
                              <XMTok name="cdot" role="MULOP">⋅</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E11.m2.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E11.m2.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                </XMApp>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                              </XMApp>
                            </XMApp>
                            <XMApp>
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMDual>
                                <XMApp>
                                  <XMTok meaning="norm"/>
                                  <XMRef idref="S3.E11.m2.2"/>
                                </XMApp>
                                <XMWrap>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                  <XMApp xml:id="S3.E11.m2.2">
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                        <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                    </XMApp>
                                    <XMDual>
                                      <XMRef idref="S3.E11.m2.2.1"/>
                                      <XMWrap>
                                        <XMTok role="OPEN" stretchy="false">(</XMTok>
                                        <XMApp xml:id="S3.E11.m2.2.1">
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                        </XMApp>
                                        <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                      </XMWrap>
                                    </XMDual>
                                  </XMApp>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                </XMWrap>
                              </XMDual>
                              <XMDual>
                                <XMApp>
                                  <XMTok meaning="norm"/>
                                  <XMRef idref="S3.E11.m2.3"/>
                                </XMApp>
                                <XMWrap>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                  <XMApp xml:id="S3.E11.m2.3">
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  </XMApp>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT">,</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMath>
                </Math>
                <MathBranch>
                  <td align="left"><Math mode="inline" tex="\displaystyle d_{GFE}=similarity(f_{\theta}^{G}(x_{q}),w_{c}^{G})=\frac{f_{%&#10;\theta}^{G}(x_{q})\cdot w_{c}^{G}}{\|f_{\theta}^{G}(x_{q})\|\|w_{c}^{G}\|}," text="d _ (G * F * E) = s * i * m * i * l * a * r * i * t * y * open-interval@((f _ theta) ^ G * x _ q, (w _ c) ^ G) = (((f _ theta) ^ G * x _ q) cdot (w _ c) ^ G) / (norm@((f _ theta) ^ G * x _ q) * norm@((w _ c) ^ G))" xml:id="S3.E11.m1">
                      <XMath>
                        <XMDual>
                          <XMRef idref="S3.E11.m1.4"/>
                          <XMWrap>
                            <XMApp xml:id="S3.E11.m1.4">
                              <XMTok meaning="multirelation"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">d</XMTok>
                                <XMApp>
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                                </XMApp>
                              </XMApp>
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">s</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">m</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">l</XMTok>
                                <XMTok font="italic" role="UNKNOWN">a</XMTok>
                                <XMTok font="italic" role="UNKNOWN">r</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">t</XMTok>
                                <XMTok font="italic" role="UNKNOWN">y</XMTok>
                                <XMDual>
                                  <XMApp>
                                    <XMTok meaning="open-interval"/>
                                    <XMRef idref="S3.E11.m1.4.1"/>
                                    <XMRef idref="S3.E11.m1.4.2"/>
                                  </XMApp>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E11.m1.4.1">
                                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                      <XMApp>
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                          <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                          <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                        </XMApp>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                      </XMApp>
                                      <XMDual>
                                        <XMRef idref="S3.E11.m1.4.1.1"/>
                                        <XMWrap>
                                          <XMTok role="OPEN" stretchy="false">(</XMTok>
                                          <XMApp xml:id="S3.E11.m1.4.1.1">
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                            <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                          </XMApp>
                                          <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                    </XMApp>
                                    <XMTok role="PUNCT">,</XMTok>
                                    <XMApp xml:id="S3.E11.m1.4.2">
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                                <XMApp>
                                  <XMTok name="cdot" role="MULOP">⋅</XMTok>
                                  <XMApp>
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                        <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                    </XMApp>
                                    <XMDual>
                                      <XMRef idref="S3.E11.m1.1"/>
                                      <XMWrap>
                                        <XMTok role="OPEN" stretchy="false">(</XMTok>
                                        <XMApp xml:id="S3.E11.m1.1">
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                        </XMApp>
                                        <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                      </XMWrap>
                                    </XMDual>
                                  </XMApp>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  </XMApp>
                                </XMApp>
                                <XMApp>
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMDual>
                                    <XMApp>
                                      <XMTok meaning="norm"/>
                                      <XMRef idref="S3.E11.m1.2"/>
                                    </XMApp>
                                    <XMWrap>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                      <XMApp xml:id="S3.E11.m1.2">
                                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                        <XMApp>
                                          <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                          <XMApp>
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                            <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                            <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                          </XMApp>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                        </XMApp>
                                        <XMDual>
                                          <XMRef idref="S3.E11.m1.2.1"/>
                                          <XMWrap>
                                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                                            <XMApp xml:id="S3.E11.m1.2.1">
                                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                              <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                            </XMApp>
                                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                          </XMWrap>
                                        </XMDual>
                                      </XMApp>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                  <XMDual>
                                    <XMApp>
                                      <XMTok meaning="norm"/>
                                      <XMRef idref="S3.E11.m1.3"/>
                                    </XMApp>
                                    <XMWrap>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                      <XMApp xml:id="S3.E11.m1.3">
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                        </XMApp>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                      </XMApp>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                              </XMApp>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMath>
                    </Math></td>
                </MathBranch>
              </MathFork>
            </equation>
            <equation xml:id="S3.E12">
              <tags>
                <tag>(12)</tag>
                <tag role="autoref">Equation 12</tag>
                <tag role="refnum">12</tag>
              </tags>
              <MathFork>
                <Math tex="\displaystyle d_{AFE}=similarity(f_{\theta}^{A}(x_{q}),w_{c}^{A})=\frac{f_{%&#10;\theta}^{A}(x_{q})\cdot w_{c}^{A}}{\|f_{\theta}^{A}(x_{q})\|\|w_{c}^{A}\|}," text="d _ (A * F * E) = s * i * m * i * l * a * r * i * t * y * open-interval@((f _ theta) ^ A * x _ q, (w _ c) ^ A) = (((f _ theta) ^ A * x _ q) cdot (w _ c) ^ A) / (norm@((f _ theta) ^ A * x _ q) * norm@((w _ c) ^ A))" xml:id="S3.E12.m2">
                  <XMath>
                    <XMDual>
                      <XMRef idref="S3.E12.m2.4"/>
                      <XMWrap>
                        <XMApp xml:id="S3.E12.m2.4">
                          <XMTok meaning="multirelation"/>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">d</XMTok>
                            <XMApp>
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">s</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">m</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">l</XMTok>
                            <XMTok font="italic" role="UNKNOWN">a</XMTok>
                            <XMTok font="italic" role="UNKNOWN">r</XMTok>
                            <XMTok font="italic" role="UNKNOWN">i</XMTok>
                            <XMTok font="italic" role="UNKNOWN">t</XMTok>
                            <XMTok font="italic" role="UNKNOWN">y</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="open-interval"/>
                                <XMRef idref="S3.E12.m2.4.1"/>
                                <XMRef idref="S3.E12.m2.4.2"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E12.m2.4.1">
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                  </XMApp>
                                  <XMDual>
                                    <XMRef idref="S3.E12.m2.4.1.1"/>
                                    <XMWrap>
                                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                                      <XMApp xml:id="S3.E12.m2.4.1.1">
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                      </XMApp>
                                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E12.m2.4.2">
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                            <XMApp>
                              <XMTok name="cdot" role="MULOP">⋅</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E12.m2.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E12.m2.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                </XMApp>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                              </XMApp>
                            </XMApp>
                            <XMApp>
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMDual>
                                <XMApp>
                                  <XMTok meaning="norm"/>
                                  <XMRef idref="S3.E12.m2.2"/>
                                </XMApp>
                                <XMWrap>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                  <XMApp xml:id="S3.E12.m2.2">
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                        <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                    </XMApp>
                                    <XMDual>
                                      <XMRef idref="S3.E12.m2.2.1"/>
                                      <XMWrap>
                                        <XMTok role="OPEN" stretchy="false">(</XMTok>
                                        <XMApp xml:id="S3.E12.m2.2.1">
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                        </XMApp>
                                        <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                      </XMWrap>
                                    </XMDual>
                                  </XMApp>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                </XMWrap>
                              </XMDual>
                              <XMDual>
                                <XMApp>
                                  <XMTok meaning="norm"/>
                                  <XMRef idref="S3.E12.m2.3"/>
                                </XMApp>
                                <XMWrap>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                  <XMApp xml:id="S3.E12.m2.3">
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                  </XMApp>
                                  <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT">,</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMath>
                </Math>
                <MathBranch>
                  <td align="left"><Math mode="inline" tex="\displaystyle d_{AFE}=similarity(f_{\theta}^{A}(x_{q}),w_{c}^{A})=\frac{f_{%&#10;\theta}^{A}(x_{q})\cdot w_{c}^{A}}{\|f_{\theta}^{A}(x_{q})\|\|w_{c}^{A}\|}," text="d _ (A * F * E) = s * i * m * i * l * a * r * i * t * y * open-interval@((f _ theta) ^ A * x _ q, (w _ c) ^ A) = (((f _ theta) ^ A * x _ q) cdot (w _ c) ^ A) / (norm@((f _ theta) ^ A * x _ q) * norm@((w _ c) ^ A))" xml:id="S3.E12.m1">
                      <XMath>
                        <XMDual>
                          <XMRef idref="S3.E12.m1.4"/>
                          <XMWrap>
                            <XMApp xml:id="S3.E12.m1.4">
                              <XMTok meaning="multirelation"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">d</XMTok>
                                <XMApp>
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                                </XMApp>
                              </XMApp>
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">s</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">m</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">l</XMTok>
                                <XMTok font="italic" role="UNKNOWN">a</XMTok>
                                <XMTok font="italic" role="UNKNOWN">r</XMTok>
                                <XMTok font="italic" role="UNKNOWN">i</XMTok>
                                <XMTok font="italic" role="UNKNOWN">t</XMTok>
                                <XMTok font="italic" role="UNKNOWN">y</XMTok>
                                <XMDual>
                                  <XMApp>
                                    <XMTok meaning="open-interval"/>
                                    <XMRef idref="S3.E12.m1.4.1"/>
                                    <XMRef idref="S3.E12.m1.4.2"/>
                                  </XMApp>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E12.m1.4.1">
                                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                      <XMApp>
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                          <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                          <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                        </XMApp>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                      </XMApp>
                                      <XMDual>
                                        <XMRef idref="S3.E12.m1.4.1.1"/>
                                        <XMWrap>
                                          <XMTok role="OPEN" stretchy="false">(</XMTok>
                                          <XMApp xml:id="S3.E12.m1.4.1.1">
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                            <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                          </XMApp>
                                          <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                    </XMApp>
                                    <XMTok role="PUNCT">,</XMTok>
                                    <XMApp xml:id="S3.E12.m1.4.2">
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                                <XMApp>
                                  <XMTok name="cdot" role="MULOP">⋅</XMTok>
                                  <XMApp>
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMApp>
                                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                      <XMApp>
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                        <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                        <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                      </XMApp>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                    </XMApp>
                                    <XMDual>
                                      <XMRef idref="S3.E12.m1.1"/>
                                      <XMWrap>
                                        <XMTok role="OPEN" stretchy="false">(</XMTok>
                                        <XMApp xml:id="S3.E12.m1.1">
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                        </XMApp>
                                        <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                      </XMWrap>
                                    </XMDual>
                                  </XMApp>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                  </XMApp>
                                </XMApp>
                                <XMApp>
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMDual>
                                    <XMApp>
                                      <XMTok meaning="norm"/>
                                      <XMRef idref="S3.E12.m1.2"/>
                                    </XMApp>
                                    <XMWrap>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                      <XMApp xml:id="S3.E12.m1.2">
                                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                        <XMApp>
                                          <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                          <XMApp>
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                            <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                            <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                          </XMApp>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                        </XMApp>
                                        <XMDual>
                                          <XMRef idref="S3.E12.m1.2.1"/>
                                          <XMWrap>
                                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                                            <XMApp xml:id="S3.E12.m1.2.1">
                                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                              <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                            </XMApp>
                                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                          </XMWrap>
                                        </XMDual>
                                      </XMApp>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                  <XMDual>
                                    <XMApp>
                                      <XMTok meaning="norm"/>
                                      <XMRef idref="S3.E12.m1.3"/>
                                    </XMApp>
                                    <XMWrap>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                      <XMApp xml:id="S3.E12.m1.3">
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                          <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                        </XMApp>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                      </XMApp>
                                      <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                              </XMApp>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMath>
                    </Math></td>
                </MathBranch>
              </MathFork>
            </equation>
          </equationgroup>
          <p>where <Math mode="inline" tex="w_{c}^{G}" text="(w _ c) ^ G" xml:id="S3.SS4.SSS1.p6.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">w</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="w_{c}^{A}" text="(w _ c) ^ A" xml:id="S3.SS4.SSS1.p6.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">w</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                </XMApp>
              </XMath>
            </Math> represent the centroids of each class in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS1.p6.m3">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math>, computed from the deep features extracted using the GFE and the AFE, respectively. Similarly, <Math mode="inline" tex="f_{\theta}^{G}(x_{q})" text="(f _ theta) ^ G * x _ q" xml:id="S3.SS4.SSS1.p6.m4">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS1.p6.m4.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS1.p6.m4.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="f_{\theta}^{A}(x_{q})" text="(f _ theta) ^ A * x _ q" xml:id="S3.SS4.SSS1.p6.m5">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS1.p6.m5.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS1.p6.m5.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> denote the deep feature vectors of the samples in the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS4.SSS1.p6.m6">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">Q</XMTok>
              </XMath>
            </Math>, encoded by the GFE and AFE, respectively. <Math mode="inline" tex="d_{GFE}^{i}" text="(d _ (G * F * E)) ^ i" xml:id="S3.SS4.SSS1.p6.m7">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">d</XMTok>
                    <XMApp>
                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="d_{AFE}^{i}" text="(d _ (A * F * E)) ^ i" xml:id="S3.SS4.SSS1.p6.m8">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">d</XMTok>
                    <XMApp>
                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
              </XMath>
            </Math> denote the cosine similarities, used here as the distance measure, between the query features encoded by the GFE and the AFE, respectively, and the corresponding class centroids.
          </p>
        </para>
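        <para>
          <p>As a minimal illustrative sketch (not the authors' implementation), the cosine similarity of Eq. (12) between a query feature and each class centroid could be computed as follows. The sketch assumes that a class centroid is the mean of that class's support features; the helper names, the class labels, and the toy two-dimensional vectors are hypothetical and merely stand in for the outputs of an encoder such as the GFE or the AFE.</p>
          <verbatim>
```python
import math


def centroid(vectors):
    """Element-wise mean of one class's support features (assumed centroid definition)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_similarities(query_feature, support_features_by_class):
    """Cosine similarity between one query feature and every class centroid."""
    return {
        label: cosine_similarity(query_feature, centroid(features))
        for label, features in support_features_by_class.items()
    }


# Hypothetical toy data: two classes with two support features each.
support = {
    "class_a": [[1.0, 0.0], [0.8, 0.2]],
    "class_b": [[0.0, 1.0], [0.2, 0.8]],
}
query = [0.9, 0.1]

scores = class_similarities(query, support)
predicted = max(scores, key=scores.get)  # class with the highest similarity
print(scores, predicted)
```
          </verbatim>
          <p>Under these assumptions, the same routine would be run once per encoder, yielding one similarity per class for each stream.</p>
        </para>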
        <para xml:id="S3.SS4.SSS1.p7">
          <p>The two streams are then fused to improve MER performance. Inspired by Gong <text font="italic">et al.</text> <cite class="ltx_citemacro_cite">[<bibref bibrefs="gong2023meta" separator="," yyseparator=","/>]</cite>, the distances computed by the two streams are combined as a weighted sum:
          </p>
        </para>
        <para xml:id="S3.SS4.SSS1.p8">
          <equation xml:id="S3.E13">
            <tags>
              <tag>(13)</tag>
              <tag role="autoref">Equation 13</tag>
              <tag role="refnum">13</tag>
            </tags>
            <Math mode="display" tex="d_{sum}^{i}=d_{GFE}^{i}+\gamma d_{AFE}^{i},\ i\in\{1,2,...,c\}," text="formulae@((d _ (s * u * m)) ^ i = (d _ (G * F * E)) ^ i + gamma * (d _ (A * F * E)) ^ i, i element-of set@(1, 2, ldots, c))" xml:id="S3.E13.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E13.m1.5"/>
                  <XMWrap>
                    <XMDual xml:id="S3.E13.m1.5">
                      <XMApp>
                        <XMTok meaning="formulae"/>
                        <XMRef idref="S3.E13.m1.5.1"/>
                        <XMRef idref="S3.E13.m1.5.2"/>
                      </XMApp>
                      <XMWrap>
                        <XMApp xml:id="S3.E13.m1.5.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                              <XMTok font="italic" role="UNKNOWN">d</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">u</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">m</XMTok>
                              </XMApp>
                            </XMApp>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="plus" role="ADDOP">+</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">d</XMTok>
                                <XMApp>
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                                </XMApp>
                              </XMApp>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                            </XMApp>
                            <XMApp>
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
                              <XMApp>
                                <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">d</XMTok>
                                  <XMApp>
                                    <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                                  </XMApp>
                                </XMApp>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                              </XMApp>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT" rpadding="5.0pt">,</XMTok>
                        <XMApp xml:id="S3.E13.m1.5.2">
                          <XMTok meaning="element-of" name="in" role="RELOP">∈</XMTok>
                          <XMTok font="italic" role="UNKNOWN">i</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="set"/>
                              <XMRef idref="S3.E13.m1.1"/>
                              <XMRef idref="S3.E13.m1.2"/>
                              <XMRef idref="S3.E13.m1.3"/>
                              <XMRef idref="S3.E13.m1.4"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="false">{</XMTok>
                              <XMTok meaning="1" role="NUMBER" xml:id="S3.E13.m1.1">1</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok meaning="2" role="NUMBER" xml:id="S3.E13.m1.2">2</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok name="ldots" role="ID" xml:id="S3.E13.m1.3">…</XMTok>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMTok font="italic" role="UNKNOWN" xml:id="S3.E13.m1.4">c</XMTok>
                              <XMTok role="CLOSE" stretchy="false">}</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMWrap>
                    </XMDual>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="d_{GFE}^{i}" text="(d _ (G * F * E)) ^ i" xml:id="S3.SS4.SSS1.p8.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">d</XMTok>
                    <XMApp>
                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
              </XMath>
            </Math> and <Math mode="inline" tex="d_{AFE}^{i}" text="(d _ (A * F * E)) ^ i" xml:id="S3.SS4.SSS1.p8.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">d</XMTok>
                    <XMApp>
                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
              </XMath>
            </Math> denote the cosine similarity distances between the feature of the <Math mode="inline" tex="i" text="i" xml:id="S3.SS4.SSS1.p8.m3">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">i</XMTok>
              </XMath>
            </Math>-th sample, encoded by GFE and AFE, respectively, and their corresponding centroids. The weighted distance <Math mode="inline" tex="d_{sum}^{i}" text="(d _ (s * u * m)) ^ i" xml:id="S3.SS4.SSS1.p8.m5">
              <XMath>
                <XMApp>
                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">d</XMTok>
                    <XMApp>
                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">u</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">m</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
              </XMath>
            </Math> is then used to identify the nearest neighbor and predict the ME class label. Here, <Math mode="inline" tex="\gamma" text="gamma" xml:id="S3.SS4.SSS1.p8.m6">
              <XMath>
                <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
              </XMath>
            </Math> represents the weighting coefficient, whose optimal value is determined through experimental validation as detailed in Section <ref labelref="LABEL:Results_and_analysis"/>.
</p>
        </para>
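        <para>
          <p>As a minimal sketch of the weighted fusion described above, the following Python (NumPy) fragment adds the GFE distance to the AFE distance scaled by the weighting coefficient and takes the arg-min over centroids. It rests on assumptions that this section does not fix: the cosine distance is taken as one minus the cosine similarity, the index i is read as ranging over the c class centroids for each query sample, and all function names and toy values are hypothetical.</p>
          <verbatim font="typewriter"><![CDATA[
```python
import numpy as np


def cosine_distance(queries, centroids):
    """Pairwise cosine distance, assumed here to be 1 - cosine similarity.

    queries: (n_query, dim), centroids: (c, dim); returns (n_query, c).
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return 1.0 - q @ p.T


def fuse_and_classify(q_gfe, c_gfe, q_afe, c_afe, gamma):
    """Weighted fusion d_sum = d_GFE + gamma * d_AFE, then nearest centroid."""
    d_gfe = cosine_distance(q_gfe, c_gfe)  # distances in the GFE feature space
    d_afe = cosine_distance(q_afe, c_afe)  # distances in the AFE feature space
    d_sum = d_gfe + gamma * d_afe
    return d_sum, d_sum.argmin(axis=1)


# Toy features (hypothetical): two query samples, two class centroids.
q_gfe = np.array([[1.0, 0.0], [0.0, 1.0]])
c_gfe = np.array([[1.0, 0.0], [0.0, 1.0]])
q_afe = np.array([[1.0, 0.0], [1.0, 0.0]])
c_afe = np.array([[1.0, 0.0], [0.0, 1.0]])

for gamma in (0.5, 2.0):
    d_sum, labels = fuse_and_classify(q_gfe, c_gfe, q_afe, c_afe, gamma)
    print(gamma, labels)
```
]]></verbatim>
          <p>In this toy case the two encoders disagree about the second query sample: with a coefficient of 0.5 the GFE distance dominates and the sample is assigned to class index 1, whereas a coefficient of 2 lets the AFE distance dominate and the prediction changes to class index 0. This sensitivity is the reason the coefficient has to be selected experimentally.</p>
        </para>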
        <para xml:id="S3.SS4.SSS1.p9">
          <p>(v) The classification module. This module computes the Euclidean distance between each feature representation in the query set and the centroid vectors of the support set, and then assigns every query sample to its nearest centroid, as illustrated in Fig. <ref labelref="LABEL:figs:MPFNet-P"/>(e). Further details are provided in Section <ref labelref="LABEL:sec:_classification"/>.
</p>
        </para>
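        <para>
          <p>The classification step can be illustrated with a short Python (NumPy) sketch. It assumes, in the style of prototypical networks, that each centroid is the mean of the support-set features of one class; this definition, the function names, and the toy data are illustrative assumptions rather than details given in this section.</p>
          <verbatim font="typewriter"><![CDATA[
```python
import numpy as np


def class_centroids(support_feats, support_labels, num_classes):
    """Assumed centroid definition: mean support embedding of each class."""
    return np.stack([
        support_feats[support_labels == k].mean(axis=0)
        for k in range(num_classes)
    ])


def classify_queries(query_feats, centroids):
    """Assign each query to the centroid with the smallest Euclidean distance."""
    # Broadcast to (n_query, num_classes, dim), then reduce over the feature axis.
    diffs = query_feats[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1), dists


# Toy support set (hypothetical): two samples for each of two classes.
support_feats = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
support_labels = np.array([0, 0, 1, 1])
query_feats = np.array([[1.0, 1.0], [4.0, 5.0]])

centroids = class_centroids(support_feats, support_labels, num_classes=2)
pred, dists = classify_queries(query_feats, centroids)
print(pred)
```
]]></verbatim>
          <p>Here the centroids are (0, 1) and (5, 4), so the first query sample lies at distance 1 from the first centroid and 5 from the second, and the two query samples receive class indices 0 and 1, respectively.</p>
        </para>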
        <figure inlist="lof" labels="LABEL:figs:MPFNet-C" placement="t" xml:id="S3.F7">
          <tags>
            <tag>Fig. 7</tag>
            <tag role="autoref">Figure 7</tag>
            <tag role="refnum">7</tag>
            <tag role="typerefnum">Fig. 7</tag>
          </tags>
          <graphics candidates="MPFNet-C.pdf" class="ltx_centering" graphic="MPFNet-C.pdf" options="width=368.577pt" xml:id="S3.F7.g1"/>
          <toccaption class="ltx_centering"><tag close=" ">7</tag>The architecture of MPFNet-C, which shares the same modules as MPFNet-P. The key distinction is that MPFNet-C employs a cascaded encoder architecture that progressively refines feature representations, enabling more effective capture of subtle and discriminative features.</toccaption>
          <caption class="ltx_centering"><tag close=": ">Fig. 7</tag>The architecture of MPFNet-C, which shares the same modules as MPFNet-P. The key distinction is that MPFNet-C employs a cascaded encoder architecture that progressively refines feature representations, enabling more effective capture of subtle and discriminative features.</caption>
        </figure>
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S3.SS4.SSS2">
        <tags>
          <tag>3.4.2</tag>
          <tag role="autoref">subsubsection 3.4.2</tag>
          <tag role="refnum">3.4.2</tag>
          <tag role="typerefnum">§3.4.2</tag>
        </tags>
        <title><tag close=" ">3.4.2</tag>MPFNet-C</title>
        <para xml:id="S3.SS4.SSS2.p1">
          <p>MPFNet-C retains the same five core modules as MPFNet-P. The primary distinction lies in its multi-prior fusion module, where the GFE and AFE are structured in a cascaded architecture. Furthermore, a residual structure is integrated to facilitate feature transmission and fusion, enhancing information flow across encoding layers, reducing potential feature loss, and improving representational capacity. The architectural details are illustrated in Fig. <ref labelref="LABEL:figs:MPFNet-C"/>.
In MPFNet-C, the support set samples undergo data preprocessing to obtain the initial features <Math mode="inline" tex="x_{s}" text="x _ s" xml:id="S3.SS4.SSS2.p1.m1">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                </XMApp>
              </XMath>
            </Math>. Subsequently, <Math mode="inline" tex="x_{s}" text="x _ s" xml:id="S3.SS4.SSS2.p1.m2">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                </XMApp>
              </XMath>
            </Math> is encoded by the GFE, producing the general feature representation <Math mode="inline" tex="f_{\theta}^{G}(x_{s})" text="(f _ theta) ^ G * x _ s" xml:id="S3.SS4.SSS2.p1.m3">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p1.m3.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p1.m3.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> for MER. Next, we concatenate <Math mode="inline" tex="f_{\theta}^{G}(x_{s})" text="(f _ theta) ^ G * x _ s" xml:id="S3.SS4.SSS2.p1.m4">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p1.m4.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p1.m4.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> with the original feature <Math mode="inline" tex="x_{s}" text="x _ s" xml:id="S3.SS4.SSS2.p1.m5">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                </XMApp>
              </XMath>
            </Math> along the channel dimension to form the intermediate feature representation <Math mode="inline" tex="f_{\theta}^{\prime}(x_{s})" text="(f _ theta) ^ prime * x _ s" xml:id="S3.SS4.SSS2.p1.m6">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok fontsize="70%" name="prime" role="SUPOP">′</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p1.m6.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p1.m6.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math>. This representation is then fed into the AFE for further encoding, yielding the advanced ME feature <Math mode="inline" tex="f_{\theta}^{A}(x_{s})" text="(f _ theta) ^ A * x _ s" xml:id="S3.SS4.SSS2.p1.m7">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p1.m7.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p1.m7.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math>. The process can be expressed as follows:
</p>
        </para>
        <para xml:id="S3.SS4.SSS2.p2">
          <equationgroup class="ltx_eqn_align" xml:id="Sx1.EGx2">
            <equation xml:id="S3.E14">
              <tags>
                <tag>(14)</tag>
                <tag role="autoref">Equation 14</tag>
                <tag role="refnum">14</tag>
              </tags>
              <MathFork>
                <Math tex="\displaystyle f_{\theta}^{G}(x_{s})=GFE(x_{s})," text="(f _ theta) ^ G * x _ s = G * F * E * x _ s" xml:id="S3.E14.m2">
                  <XMath>
                    <XMDual>
                      <XMRef idref="S3.E14.m2.1"/>
                      <XMWrap>
                        <XMApp xml:id="S3.E14.m2.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                              </XMApp>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                            </XMApp>
                            <XMDual>
                              <XMRef idref="S3.E14.m2.1.1"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E14.m2.1.1">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">G</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMTok font="italic" role="UNKNOWN">E</XMTok>
                            <XMDual>
                              <XMRef idref="S3.E14.m2.1.2"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E14.m2.1.2">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT">,</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMath>
                </Math>
                <MathBranch>
                  <td align="left"><Math mode="inline" tex="\displaystyle f_{\theta}^{G}(x_{s})=GFE(x_{s})," text="(f _ theta) ^ G * x _ s = G * F * E * x _ s" xml:id="S3.E14.m1">
                      <XMath>
                        <XMDual>
                          <XMRef idref="S3.E14.m1.1"/>
                          <XMWrap>
                            <XMApp xml:id="S3.E14.m1.1">
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E14.m1.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E14.m1.1.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">G</XMTok>
                                <XMTok font="italic" role="UNKNOWN">F</XMTok>
                                <XMTok font="italic" role="UNKNOWN">E</XMTok>
                                <XMDual>
                                  <XMRef idref="S3.E14.m1.1.2"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E14.m1.1.2">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMath>
                    </Math></td>
                </MathBranch>
              </MathFork>
            </equation>
            <equation xml:id="S3.E15">
              <tags>
                <tag>(15)</tag>
                <tag role="autoref">Equation 15</tag>
                <tag role="refnum">15</tag>
              </tags>
              <MathFork>
                <Math tex="\displaystyle f_{\theta}^{\prime}(x_{s})=Concat(x_{s},f_{\theta}^{G}(x_{s}))," text="(f _ theta) ^ prime * x _ s = C * o * n * c * a * t * open-interval@(x _ s, (f _ theta) ^ G * x _ s)" xml:id="S3.E15.m2">
                  <XMath>
                    <XMDual>
                      <XMRef idref="S3.E15.m2.1"/>
                      <XMWrap>
                        <XMApp xml:id="S3.E15.m2.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                              </XMApp>
                              <XMTok fontsize="70%" name="prime" role="SUPOP">′</XMTok>
                            </XMApp>
                            <XMDual>
                              <XMRef idref="S3.E15.m2.1.1"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E15.m2.1.1">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">C</XMTok>
                            <XMTok font="italic" role="UNKNOWN">o</XMTok>
                            <XMTok font="italic" role="UNKNOWN">n</XMTok>
                            <XMTok font="italic" role="UNKNOWN">c</XMTok>
                            <XMTok font="italic" role="UNKNOWN">a</XMTok>
                            <XMTok font="italic" role="UNKNOWN">t</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="open-interval"/>
                                <XMRef idref="S3.E15.m2.1.2"/>
                                <XMRef idref="S3.E15.m2.1.3"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E15.m2.1.2">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E15.m2.1.3">
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                    </XMApp>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                  </XMApp>
                                  <XMDual>
                                    <XMRef idref="S3.E15.m2.1.3.1"/>
                                    <XMWrap>
                                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                                      <XMApp xml:id="S3.E15.m2.1.3.1">
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                      </XMApp>
                                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PUNCT">,</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMath>
                </Math>
                <MathBranch>
                  <td align="left"><Math mode="inline" tex="\displaystyle f_{\theta}^{\prime}(x_{s})=Concat(x_{s},f_{\theta}^{G}(x_{s}))," text="(f _ theta) ^ prime * x _ s = C * o * n * c * a * t * open-interval@(x _ s, (f _ theta) ^ G * x _ s)" xml:id="S3.E15.m1">
                      <XMath>
                        <XMDual>
                          <XMRef idref="S3.E15.m1.1"/>
                          <XMWrap>
                            <XMApp xml:id="S3.E15.m1.1">
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok fontsize="70%" name="prime" role="SUPOP">′</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E15.m1.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E15.m1.1.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">C</XMTok>
                                <XMTok font="italic" role="UNKNOWN">o</XMTok>
                                <XMTok font="italic" role="UNKNOWN">n</XMTok>
                                <XMTok font="italic" role="UNKNOWN">c</XMTok>
                                <XMTok font="italic" role="UNKNOWN">a</XMTok>
                                <XMTok font="italic" role="UNKNOWN">t</XMTok>
                                <XMDual>
                                  <XMApp>
                                    <XMTok meaning="open-interval"/>
                                    <XMRef idref="S3.E15.m1.1.2"/>
                                    <XMRef idref="S3.E15.m1.1.3"/>
                                  </XMApp>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E15.m1.1.2">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="PUNCT">,</XMTok>
                                    <XMApp xml:id="S3.E15.m1.1.3">
                                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                      <XMApp>
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                          <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                          <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                        </XMApp>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">G</XMTok>
                                      </XMApp>
                                      <XMDual>
                                        <XMRef idref="S3.E15.m1.1.3.1"/>
                                        <XMWrap>
                                          <XMTok role="OPEN" stretchy="false">(</XMTok>
                                          <XMApp xml:id="S3.E15.m1.1.3.1">
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                            <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                          </XMApp>
                                          <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMath>
                    </Math></td>
                </MathBranch>
              </MathFork>
            </equation>
            <equation xml:id="S3.E16">
              <tags>
                <tag>(16)</tag>
                <tag role="autoref">Equation 16</tag>
                <tag role="refnum">16</tag>
              </tags>
              <MathFork>
                <Math tex="\displaystyle f_{\theta}^{A}(x_{s})=AFE(f_{\theta}^{\prime}(x_{s}))." text="(f _ theta) ^ A * x _ s = A * F * E * (f _ theta) ^ prime * x _ s" xml:id="S3.E16.m2">
                  <XMath>
                    <XMDual>
                      <XMRef idref="S3.E16.m2.1"/>
                      <XMWrap>
                        <XMApp xml:id="S3.E16.m2.1">
                          <XMTok meaning="equals" role="RELOP">=</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                              </XMApp>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                            </XMApp>
                            <XMDual>
                              <XMRef idref="S3.E16.m2.1.1"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E16.m2.1.1">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">A</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMTok font="italic" role="UNKNOWN">E</XMTok>
                            <XMDual>
                              <XMRef idref="S3.E16.m2.1.2"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E16.m2.1.2">
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                    <XMApp>
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                    </XMApp>
                                    <XMTok fontsize="70%" name="prime" role="SUPOP">′</XMTok>
                                  </XMApp>
                                  <XMDual>
                                    <XMRef idref="S3.E16.m2.1.2.1"/>
                                    <XMWrap>
                                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                                      <XMApp xml:id="S3.E16.m2.1.2.1">
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                      </XMApp>
                                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="PERIOD">.</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMath>
                </Math>
                <MathBranch>
                  <td align="left"><Math mode="inline" tex="\displaystyle f_{\theta}^{A}(x_{s})=AFE(f_{\theta}^{\prime}(x_{s}))." text="(f _ theta) ^ A * x _ s = A * F * E * (f _ theta) ^ prime * x _ s" xml:id="S3.E16.m1">
                      <XMath>
                        <XMDual>
                          <XMRef idref="S3.E16.m1.1"/>
                          <XMWrap>
                            <XMApp xml:id="S3.E16.m1.1">
                              <XMTok meaning="equals" role="RELOP">=</XMTok>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E16.m1.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E16.m1.1.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMApp>
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMTok font="italic" role="UNKNOWN">A</XMTok>
                                <XMTok font="italic" role="UNKNOWN">F</XMTok>
                                <XMTok font="italic" role="UNKNOWN">E</XMTok>
                                <XMDual>
                                  <XMRef idref="S3.E16.m1.1.2"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E16.m1.1.2">
                                      <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                      <XMApp>
                                        <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                        <XMApp>
                                          <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                          <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                          <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                        </XMApp>
                                        <XMTok fontsize="70%" name="prime" role="SUPOP">′</XMTok>
                                      </XMApp>
                                      <XMDual>
                                        <XMRef idref="S3.E16.m1.1.2.1"/>
                                        <XMWrap>
                                          <XMTok role="OPEN" stretchy="false">(</XMTok>
                                          <XMApp xml:id="S3.E16.m1.1.2.1">
                                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                            <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                                          </XMApp>
                                          <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                        </XMWrap>
                                      </XMDual>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                            </XMApp>
                            <XMTok role="PERIOD">.</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMath>
                    </Math></td>
                </MathBranch>
              </MathFork>
            </equation>
          </equationgroup>
        </para>
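        <para>
          <p>The cascade in Equations (15) and (16) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two extractors are passed in as plain callables, <text font="typewriter">toy_gfe</text> and <text font="typewriter">toy_afe</text> are hypothetical stand-ins for the GFE and AFE networks, and list concatenation is assumed as a stand-in for the Concat operator, whose axis is not stated here.</p>
          <verbatim font="typewriter">
```python
from typing import Callable, List

Vector = List[float]


def cascade_features(x_s: Vector,
                     gfe: Callable[[Vector], Vector],
                     afe: Callable[[Vector], Vector]) -> Vector:
    """Eq. (15)-(16): f' = Concat(x_s, GFE(x_s)), then f^A = AFE(f')."""
    f_g = gfe(x_s)        # f_theta^G(x_s), output of the first extractor
    f_prime = x_s + f_g   # list concatenation stands in for Concat (assumption)
    return afe(f_prime)   # f_theta^A(x_s), output of the second extractor


# Hypothetical stand-ins for the two extractors (not the paper's networks).
def toy_gfe(x: Vector) -> Vector:
    return [sum(x) / len(x)]      # a single "global" summary value


def toy_afe(x: Vector) -> Vector:
    return [2.0 * v for v in x]   # a simple element-wise map


features = cascade_features([1.0, 2.0, 3.0], toy_gfe, toy_afe)
```
          </verbatim>
          <p>The point of the sketch is only the data flow: the second extractor sees the raw input together with the output of the first one, rather than the output of the first one alone.</p>
        </para>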
        <para xml:id="S3.SS4.SSS2.p3">
          <p>Next, for the feature vectors <Math mode="inline" tex="f_{\theta}^{A}(x_{s})" text="(f _ theta) ^ A * x _ s" xml:id="S3.SS4.SSS2.p3.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p3.m1.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p3.m1.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">s</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS2.p3.m2">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math>, we compute the mean of features belonging to the same class to obtain the centroid <Math mode="inline" tex="w_{c}" text="w _ c" xml:id="S3.SS4.SSS2.p3.m3">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
              </XMath>
            </Math> for each class, as defined in Equation <ref labelref="LABEL:wc"/>. Similarly, the feature vectors <Math mode="inline" tex="f_{\theta}^{A}(x_{q})" text="(f _ theta) ^ A * x _ q" xml:id="S3.SS4.SSS2.p3.m4">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMApp>
                    <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">f</XMTok>
                      <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                    </XMApp>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                  </XMApp>
                  <XMDual>
                    <XMRef idref="S3.SS4.SSS2.p3.m4.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.SS4.SSS2.p3.m4.1">
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
              </XMath>
            </Math> of the samples in the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS4.SSS2.p3.m5">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">Q</XMTok>
              </XMath>
            </Math> are computed in the same manner. The distance is then taken to be the cosine similarity, in the embedding space, between each query feature vector and each class centroid of the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS4.SSS2.p3.m6">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">S</XMTok>
              </XMath>
            </Math>, as follows:
</p>
        </para>
        <para xml:id="S3.SS4.SSS2.p4">
          <equation xml:id="S3.E17">
            <tags>
              <tag>(17)</tag>
              <tag role="autoref">Equation 17</tag>
              <tag role="refnum">17</tag>
            </tags>
            <Math mode="display" tex="d_{FE}=similarity(f_{\theta}^{A}(x_{q}),w_{c})=\frac{f_{\theta}^{A}(x_{q})%&#10;\cdot w_{c}}{\|f_{\theta}^{A}(x_{q})\|\|w_{c}\|}." text="d _ (F * E) = s * i * m * i * l * a * r * i * t * y * open-interval@((f _ theta) ^ A * x _ q, w _ c) = (((f _ theta) ^ A * x _ q) cdot w _ c) / (norm@((f _ theta) ^ A * x _ q) * norm@(w _ c))" xml:id="S3.E17.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S3.E17.m1.4"/>
                  <XMWrap>
                    <XMApp xml:id="S3.E17.m1.4">
                      <XMTok meaning="multirelation"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                        <XMTok font="italic" role="UNKNOWN">d</XMTok>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">F</XMTok>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">E</XMTok>
                        </XMApp>
                      </XMApp>
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">s</XMTok>
                        <XMTok font="italic" role="UNKNOWN">i</XMTok>
                        <XMTok font="italic" role="UNKNOWN">m</XMTok>
                        <XMTok font="italic" role="UNKNOWN">i</XMTok>
                        <XMTok font="italic" role="UNKNOWN">l</XMTok>
                        <XMTok font="italic" role="UNKNOWN">a</XMTok>
                        <XMTok font="italic" role="UNKNOWN">r</XMTok>
                        <XMTok font="italic" role="UNKNOWN">i</XMTok>
                        <XMTok font="italic" role="UNKNOWN">t</XMTok>
                        <XMTok font="italic" role="UNKNOWN">y</XMTok>
                        <XMDual>
                          <XMApp>
                            <XMTok meaning="open-interval"/>
                            <XMRef idref="S3.E17.m1.4.1"/>
                            <XMRef idref="S3.E17.m1.4.2"/>
                          </XMApp>
                          <XMWrap>
                            <XMTok role="OPEN" stretchy="false">(</XMTok>
                            <XMApp xml:id="S3.E17.m1.4.1">
                              <XMTok meaning="times" role="MULOP">⁢</XMTok>
                              <XMApp>
                                <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                </XMApp>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                              </XMApp>
                              <XMDual>
                                <XMRef idref="S3.E17.m1.4.1.1"/>
                                <XMWrap>
                                  <XMTok role="OPEN" stretchy="false">(</XMTok>
                                  <XMApp xml:id="S3.E17.m1.4.1.1">
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                                    <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                  </XMApp>
                                  <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                </XMWrap>
                              </XMDual>
                            </XMApp>
                            <XMTok role="PUNCT">,</XMTok>
                            <XMApp xml:id="S3.E17.m1.4.2">
                              <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                              <XMTok font="italic" role="UNKNOWN">w</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                            <XMTok role="CLOSE" stretchy="false">)</XMTok>
                          </XMWrap>
                        </XMDual>
                      </XMApp>
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                        <XMApp>
                          <XMTok name="cdot" role="MULOP">⋅</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMApp>
                              <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                              <XMApp>
                                <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                              </XMApp>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                            </XMApp>
                            <XMDual>
                              <XMRef idref="S3.E17.m1.1"/>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E17.m1.1">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                            <XMTok font="italic" role="UNKNOWN">w</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                          </XMApp>
                        </XMApp>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="norm"/>
                              <XMRef idref="S3.E17.m1.2"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                              <XMApp xml:id="S3.E17.m1.2">
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">A</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E17.m1.2.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E17.m1.2.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">q</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                            </XMWrap>
                          </XMDual>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="norm"/>
                              <XMRef idref="S3.E17.m1.3"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                              <XMApp xml:id="S3.E17.m1.3">
                                <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                              </XMApp>
                              <XMTok meaning="parallel-to" name="||" role="VERTBAR">∥</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PERIOD">.</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
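        <para>
          <p>As a rough illustration of the centroid computation and of Equation (17), the following sketch uses plain Python lists in place of learned feature vectors. The toy support features, the labels, and the <text font="typewriter">eps</text> guard against zero-norm vectors are assumptions made for the example and do not come from the paper.</p>
          <verbatim font="typewriter">
```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def class_centroids(features: Sequence[Vector],
                    labels: Sequence[int]) -> Dict[int, Vector]:
    """Average the support features of each class to get the centroid w_c."""
    groups: Dict[int, List[Vector]] = {}
    for f, y in zip(features, labels):
        groups.setdefault(y, []).append(f)
    return {c: [sum(col) / len(fs) for col in zip(*fs)]
            for c, fs in groups.items()}


def cosine_similarity(u: Vector, v: Vector, eps: float = 1e-12) -> float:
    """Eq. (17): dot(u, v) / (norm(u) * norm(v)).

    eps avoids division by zero for all-zero vectors (added assumption).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


# Toy support features for two classes (illustrative values only).
support = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = [0, 0, 1, 1]
centroids = class_centroids(support, labels)

# Similarity of one toy query feature to every class centroid.
query = [1.0, 1.0]
d_fe = {c: cosine_similarity(query, w) for c, w in centroids.items()}
```
          </verbatim>
          <p>Because the measure is a similarity, a larger value means that the query is closer to the class centroid.</p>
        </para>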
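<!--  %In MPFNet-C, a residual structure passes information between the GFE and the AFE. This design strengthens the feature representation and mitigates vanishing gradients in deep feature learning. Specifically, the residual structure uses skip connections to transmit information between encoding layers, which mitigates the information loss that may occur during feature extraction. In addition, the design helps strengthen the feature representation and raises the model's sensitivity to subtle changes in micro-expressions, thereby improving recognition accuracy and generalization.
     %Enhanced information flow: passing the GFE output features directly to the AFE avoids information loss, so that preliminary and high-level features work together. Gradient stability: the skip connection helps gradients propagate through the deep network and reduces vanishing or exploding gradients. Feature reuse: the GFE may extract shallow global features, while the AFE focuses on finer-grained features. The residual connection ensures that the global information produced by the GFE is not lost during AFE processing.-->      </subsubsection>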
    </subsection>
    <subsection inlist="toc" labels="LABEL:sec:_classification" xml:id="S3.SS5">
      <tags>
        <tag>3.5</tag>
        <tag role="autoref">subsection 3.5</tag>
        <tag role="refnum">3.5</tag>
        <tag role="typerefnum">§3.5</tag>
      </tags>
      <title><tag close=" ">3.5</tag><text font="italic">Classification in meta-learning pipeline</text></title>
      <para xml:id="S3.SS5.p1">
        <p>In this study, we treat MER as a few-shot classification problem and solve it with a metric-based meta-learning framework. The few-shot MER problem is defined as follows. Let <Math mode="inline" tex="\tau" text="tau" xml:id="S3.SS5.p1.m1">
            <XMath>
              <XMTok font="italic" name="tau" role="UNKNOWN">τ</XMTok>
            </XMath>
          </Math> denote an <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m2">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math>-way, <Math mode="inline" tex="K" text="K" xml:id="S3.SS5.p1.m3">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">K</XMTok>
            </XMath>
          </Math>-shot MER task from the target domain, which consists of a labeled support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS5.p1.m4">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">S</XMTok>
            </XMath>
          </Math> and an unlabeled query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS5.p1.m5">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">Q</XMTok>
            </XMath>
          </Math>. <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m6">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math> indicates that the support set contains samples from <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m7">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math> different classes, and <Math mode="inline" tex="K" text="K" xml:id="S3.SS5.p1.m8">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">K</XMTok>
            </XMath>
          </Math> is the number of labeled training samples in each class of a task. The query samples are also drawn from these <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m9">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math> categories, and the goal of an <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m10">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math>-way, <Math mode="inline" tex="K" text="K" xml:id="S3.SS5.p1.m11">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">K</XMTok>
            </XMath>
          </Math>-shot classification task is to assign each unlabeled sample in the query set to one of the <Math mode="inline" tex="N" text="N" xml:id="S3.SS5.p1.m12">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">N</XMTok>
            </XMath>
          </Math> categories. We use a metric-based meta-learning pipeline to compute the cosine similarity, in the embedding space, between each sample in the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS5.p1.m13">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">Q</XMTok>
            </XMath>
          </Math> and the centroid of each class in the support set <Math mode="inline" tex="S" text="S" xml:id="S3.SS5.p1.m14">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">S</XMTok>
            </XMath>
          </Math>, and classify the samples in query set using a nearest neighbour method. The model is trained over many episodes to minimize the prediction error over the query set <Math mode="inline" tex="Q" text="Q" xml:id="S3.SS5.p1.m15">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">Q</XMTok>
            </XMath>
          </Math>. In this paper, we set <Math mode="inline" tex="K" text="K" xml:id="S3.SS5.p1.m16">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">K</XMTok>
            </XMath>
          </Math> to 5, following the standard protocol for few-shot image classification.</p>
      </para>
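      <para>
        <p>The episodic procedure described above can be sketched in plain Python. This is an illustrative sketch only, not the implementation used in this work: the function names (sample_episode, class_centroids, classify), the number of query samples per class, and the choice of one minus the cosine similarity as the cosine distance are assumptions made for the example.</p>
        <verbatim>
```python
import math
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Pick N classes, then K support and n_query query indices per class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        # Sampling without replacement keeps support and query disjoint.
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += picked[:k_shot]
        query += picked[k_shot:]
    return classes, support, query


def cosine_distance(u, v):
    """One minus the cosine similarity (assumed definition of the distance)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def class_centroids(embeddings, labels, support):
    """Mean embedding of the support samples of each class."""
    groups = defaultdict(list)
    for idx in support:
        groups[labels[idx]].append(embeddings[idx])
    return {c: [sum(col) / len(vecs) for col in zip(*vecs)]
            for c, vecs in groups.items()}


def classify(embedding, centroids):
    """Return the class whose centroid is nearest in cosine distance."""
    return min(centroids, key=lambda c: cosine_distance(embedding, centroids[c]))
```
        </verbatim>
        <p>Here labels is a list of class labels and embeddings is the matching list of feature vectors. A call such as sample_episode(labels, 5, 5, 3, random.Random(0)) would draw a 5-way, 5-shot episode with three query samples per class; each query embedding is then passed to classify together with the centroids computed from the support indices.</p>
      </para>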
      <para xml:id="S3.SS5.p2">
        <p>In Section <ref labelref="LABEL:sec:_variants"/>, we compute the cosine distance between the deep feature vector of each query sample and the centroids of the clusters formed by the support samples. The negated distances are then passed through a softmax function to obtain the probability that sample <Math mode="inline" tex="x_{i}" text="x _ i" xml:id="S3.SS5.p2.m1">
            <XMath>
              <XMApp>
                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                <XMTok font="italic" role="UNKNOWN">x</XMTok>
                <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
              </XMApp>
            </XMath>
          </Math> belongs to class <Math mode="inline" tex="c" text="c" xml:id="S3.SS5.p2.m2">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">c</XMTok>
            </XMath>
          </Math>. The probability is given by:</p>
      </para>
      <para xml:id="S3.SS5.p3">
        <equation xml:id="S3.E18">
          <tags>
            <tag>(18)</tag>
            <tag role="autoref">Equation 18</tag>
            <tag role="refnum">18</tag>
          </tags>
          <Math mode="display" tex="p_{i}^{c}=p(y_{i}=c\mid x_{i})=\frac{\exp(-d(f_{\theta}(x_{i}),w_{c}))}{\sum_{c^{\prime}}\exp(-d(f_{\theta}(x_{i}),w_{c^{\prime}}))}," xml:id="S3.E18.m1">
            <XMath>
              <XMApp>
                <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                  <XMTok font="italic" role="UNKNOWN">p</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
              </XMApp>
              <XMTok meaning="equals" role="RELOP">=</XMTok>
              <XMTok font="italic" role="UNKNOWN">p</XMTok>
              <XMWrap>
                <XMTok role="OPEN" stretchy="false">(</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">y</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
                <XMTok meaning="equals" role="RELOP">=</XMTok>
                <XMTok font="italic" role="UNKNOWN">c</XMTok>
                <XMTok role="VERTBAR" stretchy="false">|</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">x</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
                <XMTok role="CLOSE" stretchy="false">)</XMTok>
              </XMWrap>
              <XMTok meaning="equals" role="RELOP">=</XMTok>
              <XMApp>
                <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                <XMApp>
                  <XMTok meaning="exponential" role="OPFUNCTION">exp</XMTok>
                  <XMDual>
                    <XMRef idref="S3.E18.m1.1"/>
                    <XMWrap>
                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                      <XMApp xml:id="S3.E18.m1.1">
                        <XMTok meaning="minus" role="ADDOP">-</XMTok>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMTok font="italic" role="UNKNOWN">d</XMTok>
                          <XMDual>
                            <XMApp>
                              <XMTok meaning="open-interval"/>
                              <XMRef idref="S3.E18.m1.1.1"/>
                              <XMRef idref="S3.E18.m1.1.2"/>
                            </XMApp>
                            <XMWrap>
                              <XMTok role="OPEN" stretchy="false">(</XMTok>
                              <XMApp xml:id="S3.E18.m1.1.1">
                                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                <XMApp>
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                </XMApp>
                                <XMDual>
                                  <XMRef idref="S3.E18.m1.1.1.1"/>
                                  <XMWrap>
                                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                                    <XMApp xml:id="S3.E18.m1.1.1.1">
                                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                                    </XMApp>
                                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                  </XMWrap>
                                </XMDual>
                              </XMApp>
                              <XMTok role="PUNCT">,</XMTok>
                              <XMApp xml:id="S3.E18.m1.1.2">
                                <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                              </XMApp>
                              <XMTok role="CLOSE" stretchy="false">)</XMTok>
                            </XMWrap>
                          </XMDual>
                        </XMApp>
                      </XMApp>
                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                    </XMWrap>
                  </XMDual>
                </XMApp>
                <XMApp>
                  <XMApp scriptpos="post">
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok mathstyle="text" meaning="sum" role="SUMOP" scriptpos="post">∑</XMTok>
                    <XMApp>
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                      <XMTok fontsize="50%" name="prime" role="SUPOP">′</XMTok>
                    </XMApp>
                  </XMApp>
                  <XMApp>
                    <XMTok meaning="exponential" role="OPFUNCTION">exp</XMTok>
                    <XMDual>
                      <XMRef idref="S3.E18.m1.2"/>
                      <XMWrap>
                        <XMTok role="OPEN" stretchy="false">(</XMTok>
                        <XMApp xml:id="S3.E18.m1.2">
                          <XMTok meaning="minus" role="ADDOP">-</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">d</XMTok>
                            <XMDual>
                              <XMApp>
                                <XMTok meaning="open-interval"/>
                                <XMRef idref="S3.E18.m1.2.1"/>
                                <XMRef idref="S3.E18.m1.2.2"/>
                              </XMApp>
                              <XMWrap>
                                <XMTok role="OPEN" stretchy="false">(</XMTok>
                                <XMApp xml:id="S3.E18.m1.2.1">
                                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                                  <XMApp>
                                    <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                    <XMTok font="italic" role="UNKNOWN">f</XMTok>
                                    <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                                  </XMApp>
                                  <XMDual>
                                    <XMRef idref="S3.E18.m1.2.1.1"/>
                                    <XMWrap>
                                      <XMTok role="OPEN" stretchy="false">(</XMTok>
                                      <XMApp xml:id="S3.E18.m1.2.1.1">
                                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                        <XMTok font="italic" role="UNKNOWN">x</XMTok>
                                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                                      </XMApp>
                                      <XMTok role="CLOSE" stretchy="false">)</XMTok>
                                    </XMWrap>
                                  </XMDual>
                                </XMApp>
                                <XMTok role="PUNCT">,</XMTok>
                                <XMApp xml:id="S3.E18.m1.2.2">
                                  <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                                  <XMApp>
                                    <XMTok role="SUPERSCRIPTOP" scriptpos="post3"/>
                                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                    <XMTok fontsize="50%" name="prime" role="SUPOP">′</XMTok>
                                  </XMApp>
                                </XMApp>
                                <XMTok role="CLOSE" stretchy="false">)</XMTok>
                              </XMWrap>
                            </XMDual>
                          </XMApp>
                        </XMApp>
                        <XMTok role="CLOSE" stretchy="false">)</XMTok>
                      </XMWrap>
                    </XMDual>
                  </XMApp>
                </XMApp>
              </XMApp>
              <XMTok role="PUNCT">,</XMTok>
            </XMath>
          </Math>
        </equation>
        <p>where <Math mode="inline" tex="d(f_{\theta}(x_{i}),w_{c})" xml:id="S3.SS5.p3.m1">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">d</XMTok>
              <XMWrap>
                <XMTok role="OPEN" stretchy="false">(</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                </XMApp>
                <XMWrap>
                  <XMTok role="OPEN" stretchy="false">(</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">x</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                  </XMApp>
                  <XMTok role="CLOSE" stretchy="false">)</XMTok>
                </XMWrap>
                <XMTok role="PUNCT">,</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">w</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
                <XMTok role="CLOSE" stretchy="false">)</XMTok>
              </XMWrap>
            </XMath>
          </Math> is the cosine distance between the encoded feature of a query sample, <Math mode="inline" tex="f_{\theta}(x_{i})" text="f _ theta * x _ i" xml:id="S3.SS5.p3.m2">
            <XMath>
              <XMApp>
                <XMTok meaning="times" role="MULOP">⁢</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">f</XMTok>
                  <XMTok font="italic" fontsize="70%" name="theta" role="UNKNOWN">θ</XMTok>
                </XMApp>
                <XMDual>
                  <XMRef idref="S3.SS5.p3.m2.1"/>
                  <XMWrap>
                    <XMTok role="OPEN" stretchy="false">(</XMTok>
                    <XMApp xml:id="S3.SS5.p3.m2.1">
                      <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                      <XMTok font="italic" role="UNKNOWN">x</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                    </XMApp>
                    <XMTok role="CLOSE" stretchy="false">)</XMTok>
                  </XMWrap>
                </XMDual>
              </XMApp>
            </XMath>
          </Math>, and the centroid vector <Math mode="inline" tex="w_{c}" text="w _ c" xml:id="S3.SS5.p3.m3">
            <XMath>
              <XMApp>
                <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                <XMTok font="italic" role="UNKNOWN">w</XMTok>
                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
              </XMApp>
            </XMath>
          </Math>.
The classification loss is defined as the cross-entropy between the predicted and ground-truth distributions over the query set:</p>
      </para>
      <para xml:id="S3.SS5.p4">
        <equation xml:id="S3.E19">
          <tags>
            <tag>(19)</tag>
            <tag role="autoref">Equation 19</tag>
            <tag role="refnum">19</tag>
          </tags>
          <Math mode="display" tex="\mathcal{}{L_{c}}=-\sum_{i=1}^{n}Y_{i}\log{P_{i}}," text="L _ c = - ((sum _ (i = 1)) ^ n)@(Y _ i * logarithm@(P _ i))" xml:id="S3.E19.m1">
            <XMath>
              <XMDual>
                <XMRef idref="S3.E19.m1.1"/>
                <XMWrap>
                  <XMApp xml:id="S3.E19.m1.1">
                    <XMTok meaning="equals" role="RELOP">=</XMTok>
                    <XMApp>
                      <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                      <XMTok font="italic" role="UNKNOWN">L</XMTok>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                    </XMApp>
                    <XMApp>
                      <XMTok meaning="minus" role="ADDOP">-</XMTok>
                      <XMApp>
                        <XMApp scriptpos="mid">
                          <XMTok role="SUPERSCRIPTOP" scriptpos="mid1"/>
                          <XMApp scriptpos="mid">
                            <XMTok role="SUBSCRIPTOP" scriptpos="mid1"/>
                            <XMTok mathstyle="display" meaning="sum" role="SUMOP" scriptpos="mid">∑</XMTok>
                            <XMApp>
                              <XMTok fontsize="70%" meaning="equals" role="RELOP">=</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                              <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">n</XMTok>
                        </XMApp>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                            <XMTok font="italic" role="UNKNOWN">Y</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="logarithm" role="OPFUNCTION">log</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                              <XMTok font="italic" role="UNKNOWN">P</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                  </XMApp>
                  <XMTok role="PUNCT">,</XMTok>
                </XMWrap>
              </XMDual>
            </XMath>
          </Math>
        </equation>
        <p>where <Math mode="inline" tex="P_{i}=\left[p_{i}^{1},p_{i}^{2},\ldots,p_{i}^{C}\right]" text="P _ i = list@((p _ i) ^ 1, (p _ i) ^ 2, ldots, (p _ i) ^ C)" xml:id="S3.SS5.p4.m1">
            <XMath>
              <XMApp>
                <XMTok meaning="equals" role="RELOP">=</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">P</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
                <XMDual>
                  <XMApp>
                    <XMTok meaning="list"/>
                    <XMRef idref="S3.SS5.p4.m1.2"/>
                    <XMRef idref="S3.SS5.p4.m1.3"/>
                    <XMRef idref="S3.SS5.p4.m1.1"/>
                    <XMRef idref="S3.SS5.p4.m1.4"/>
                  </XMApp>
                  <XMWrap>
                    <XMTok role="OPEN" stretchy="true">[</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m1.2">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">p</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m1.3">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">p</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok fontsize="70%" meaning="2" role="NUMBER">2</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMTok name="ldots" role="ID" xml:id="S3.SS5.p4.m1.1">…</XMTok>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m1.4">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">p</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">C</XMTok>
                    </XMApp>
                    <XMTok role="CLOSE" stretchy="true">]</XMTok>
                  </XMWrap>
                </XMDual>
              </XMApp>
            </XMath>
          </Math> is the predicted distribution, <Math mode="inline" tex="C" text="C" xml:id="S3.SS5.p4.m2">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">C</XMTok>
            </XMath>
          </Math> is the total number of ME categories, and <Math mode="inline" tex="Y_{i}=\left[y_{i}^{1},y_{i}^{2},\ldots,y_{i}^{C}\right]" text="Y _ i = list@((y _ i) ^ 1, (y _ i) ^ 2, ldots, (y _ i) ^ C)" xml:id="S3.SS5.p4.m3">
            <XMath>
              <XMApp>
                <XMTok meaning="equals" role="RELOP">=</XMTok>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">Y</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                </XMApp>
                <XMDual>
                  <XMApp>
                    <XMTok meaning="list"/>
                    <XMRef idref="S3.SS5.p4.m3.2"/>
                    <XMRef idref="S3.SS5.p4.m3.3"/>
                    <XMRef idref="S3.SS5.p4.m3.1"/>
                    <XMRef idref="S3.SS5.p4.m3.4"/>
                  </XMApp>
                  <XMWrap>
                    <XMTok role="OPEN" stretchy="true">[</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m3.2">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">y</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m3.3">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">y</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok fontsize="70%" meaning="2" role="NUMBER">2</XMTok>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMTok name="ldots" role="ID" xml:id="S3.SS5.p4.m3.1">…</XMTok>
                    <XMTok role="PUNCT">,</XMTok>
                    <XMApp xml:id="S3.SS5.p4.m3.4">
                      <XMTok role="SUPERSCRIPTOP" scriptpos="post2"/>
                      <XMApp>
                        <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                        <XMTok font="italic" role="UNKNOWN">y</XMTok>
                        <XMTok font="italic" fontsize="70%" role="UNKNOWN">i</XMTok>
                      </XMApp>
                      <XMTok font="italic" fontsize="70%" role="UNKNOWN">C</XMTok>
                    </XMApp>
                    <XMTok role="CLOSE" stretchy="true">]</XMTok>
                  </XMWrap>
                </XMDual>
              </XMApp>
            </XMath>
          </Math> is the ground-truth distribution of the <Math mode="inline" tex="i" text="i" xml:id="S3.SS5.p4.m4">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">i</XMTok>
            </XMath>
          </Math>-th data sample, and <Math mode="inline" tex="n" text="n" xml:id="S3.SS5.p4.m5">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">n</XMTok>
            </XMath>
          </Math> is the number of samples in the query set.</p>
      </para>
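      <para>
        <p>For illustration, Equations (18) and (19) can be evaluated with a few lines of plain Python. This is a minimal sketch under stated assumptions rather than the implementation used in this work: the function names are hypothetical, the cosine distance is taken to be one minus the cosine similarity, no temperature or scaling factor is applied, and the targets are assumed to be one-hot vectors.</p>
        <verbatim>
```python
import math


def cosine_distance(u, v):
    """One minus the cosine similarity (assumed definition of d)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def class_probabilities(feature, centroids):
    """Eq. (18): softmax over the negated distances to every class centroid."""
    logits = [-cosine_distance(feature, w) for w in centroids]
    shift = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(targets, probabilities):
    """Eq. (19): cross-entropy summed over the query samples."""
    loss = 0.0
    for y, p in zip(targets, probabilities):
        # Classes with a zero target contribute nothing and are skipped.
        loss -= sum(y_c * math.log(p_c) for y_c, p_c in zip(y, p) if y_c)
    return loss
```
        </verbatim>
        <p>Because the softmax is applied to negated distances, the centroid closest to the query feature receives the highest probability, and the loss decreases as the query feature moves towards the centroid of its true class.</p>
      </para>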
<!--  %We␣use␣this␣loss␣to␣update␣dual-streams,␣respectively. -->    </subsection>
  </section>
  <section inlist="toc" labels="LABEL:experiments" xml:id="S4">
    <tags>
      <tag>4</tag>
      <tag role="autoref">section 4</tag>
      <tag role="refnum">4</tag>
      <tag role="typerefnum">§4</tag>
    </tags>
    <title><tag close=" ">4</tag><text font="smallcaps">Experiments</text></title>
    <para xml:id="S4.p1">
      <p>In this section, we first describe the publicly available ME datasets used in this study. We then explain the classification tasks and evaluation metrics, followed by the implementation details of model training and optimization.</p>
    </para>
    <subsection inlist="toc" xml:id="S4.SS1">
      <tags>
        <tag>4.1</tag>
        <tag role="autoref">subsection 4.1</tag>
        <tag role="refnum">4.1</tag>
        <tag role="typerefnum">§4.1</tag>
      </tags>
      <title><tag close=" ">4.1</tag><text font="italic">ME datasets</text></title>
      <para xml:id="S4.SS1.p1">
        <p>We employ three public ME datasets for experimental evaluation: SMIC <cite class="ltx_citemacro_cite">[<bibref bibrefs="pfister2011recognising" separator="," yyseparator=","/>]</cite>, CASME II <cite class="ltx_citemacro_cite">[<bibref bibrefs="yan2014casme" separator="," yyseparator=","/>]</cite> and SAMM <cite class="ltx_citemacro_cite">[<bibref bibrefs="davison2016samm" separator="," yyseparator=","/>]</cite>, along with their composite dataset, MEGC2019-CD <cite class="ltx_citemacro_cite">[<bibref bibrefs="see2019megc" separator="," yyseparator=","/>]</cite>. Below, we provide detailed characteristics of each dataset.
        </p>
      </para>
      <para xml:id="S4.SS1.p2">
        <p><text font="bold">SMIC</text>: The SMIC dataset contains 164 ME clips from 16 subjects of three ethnicities, recorded at 100 fps. The resolution of the samples is 640<Math mode="inline" tex="\times" text="*" xml:id="S4.SS1.p2.m1">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math>480 pixels. SMIC has three ME classes: negative, positive, and surprise.</p>
      </para>
      <para xml:id="S4.SS1.p3">
        <p><text font="bold">CASME II</text>: The CASME II dataset contains 256 ME samples from 26 subjects, recorded at 200 fps. The resolution of the samples is 640<Math mode="inline" tex="\times" text="*" xml:id="S4.SS1.p3.m1">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math>480 pixels. The samples in CASME II are categorized into five ME classes: happiness, surprise, disgust, repression, and others.</p>
      </para>
      <para xml:id="S4.SS1.p4">
        <p><text font="bold">SAMM</text>: The SAMM dataset contains 159 ME instances from 32 participants, recorded at 200 fps with a resolution of 2,040<Math mode="inline" tex="\times" text="*" xml:id="S4.SS1.p4.m1">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math>1,088 pixels. The samples in SAMM cover seven ME classes: happiness, surprise, disgust, repression, anger, fear, and contempt.</p>
      </para>
      <para xml:id="S4.SS1.p5">
        <p><text font="bold">MEGC2019-CD</text>: The MEGC2019-CD dataset was introduced by the Micro-Expression Grand Challenge 2019 (MEGC2019) <cite class="ltx_citemacro_cite">[<bibref bibrefs="see2019megc" separator="," yyseparator=","/>]</cite>. It integrates three ME datasets—SMIC-HS, CASME II, and SAMM—and categorizes emotions into three groups: Negative (comprising “Repression,” “Anger,” “Contempt,” “Disgust,” “Fear,” and “Sadness”), Positive (“Happiness”), and Surprise (“Surprise”).</p>
      </para>
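      <para>
        <p>The three-class grouping quoted above can be written as a small lookup, sketched here in Python. The dictionary name, the function name, and the lowercase label strings are illustrative assumptions; the annotation strings used in the actual dataset files may differ, and labels outside the quoted grouping (for example "others") are simply left unmapped.</p>
        <verbatim>
```python
# Grouping as quoted in the text above; the lowercase keys are an
# illustrative convention, not the datasets' actual annotation strings.
MEGC2019_GROUPS = {
    "negative": {"repression", "anger", "contempt", "disgust", "fear", "sadness"},
    "positive": {"happiness"},
    "surprise": {"surprise"},
}


def to_three_class(label):
    """Map an original emotion label to its three-class group, or None."""
    key = label.strip().lower()
    for group, members in MEGC2019_GROUPS.items():
        # Labels that are already a group name (as in SMIC) map to themselves.
        if key in members or key == group:
            return group
    return None  # label is not covered by the grouping quoted above
```
        </verbatim>
      </para>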
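<!--  %The MEGC2019-CD dataset relabels and combines CASME II and SAMM according to the SMIC labels. The composite dataset has 68 subjects (16 from SMIC, 24 from CASME II, and 28 from SAMM), so it covers subjects from diverse backgrounds (ethnicity, environment, and gender). The ME types fall into three classes: negative, positive, and surprise. -->    </subsection>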
    <subsection inlist="toc" xml:id="S4.SS2">
      <tags>
        <tag>4.2</tag>
        <tag role="autoref">subsection 4.2</tag>
        <tag role="refnum">4.2</tag>
        <tag role="typerefnum">§4.2</tag>
      </tags>
      <title><tag close=" ">4.2</tag><text font="italic">Tasks and metrics</text></title>
      <para xml:id="S4.SS2.p1">
        <p>Following previous research and MEGC 2019, we conduct comprehensive experiments on the SMIC, CASME II, and SAMM datasets under two protocols: Single Database Evaluation (SDE) and Composite Database Evaluation (CDE).</p>
      </para>
      <subsubsection inlist="toc" xml:id="S4.SS2.SSS1">
        <tags>
          <tag>4.2.1</tag>
          <tag role="autoref">subsubsection 4.2.1</tag>
          <tag role="refnum">4.2.1</tag>
          <tag role="typerefnum">§4.2.1</tag>
        </tags>
        <title><tag close=" ">4.2.1</tag>The SDE task</title>
        <para xml:id="S4.SS2.SSS1.p1">
          <p>The SDE task involves conducting experiments on each of the three datasets using their original emotion labels. Specifically, the SMIC dataset contains three emotion categories, while the CASME II and SAMM datasets share a common set of five emotion categories.
          </p>
        </para>
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S4.SS2.SSS2">
        <tags>
          <tag>4.2.2</tag>
          <tag role="autoref">subsubsection 4.2.2</tag>
          <tag role="refnum">4.2.2</tag>
          <tag role="typerefnum">§4.2.2</tag>
        </tags>
        <title><tag close=" ">4.2.2</tag>The CDE task</title>
        <para xml:id="S4.SS2.SSS2.p1">
          <p>In MEGC2019, the original emotion labels of the three datasets were consolidated into three broad categories: negative, positive, and surprise. Following this scheme, we first conduct three-class classification experiments on each dataset separately. Then, for a more comprehensive evaluation, we merge the three datasets into the composite dataset MEGC2019-CD and perform further experimental analyses.</p>
        </para>
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S4.SS2.SSS3">
        <tags>
          <tag>4.2.3</tag>
          <tag role="autoref">subsubsection 4.2.3</tag>
          <tag role="refnum">4.2.3</tag>
          <tag role="typerefnum">§4.2.3</tag>
        </tags>
        <title><tag close=" ">4.2.3</tag>Evaluation metrics</title>
        <para xml:id="S4.SS2.SSS3.p1">
          <p>To independently evaluate each participant’s samples, we employ the Leave-One-Subject-Out (LOSO) cross-validation method to assess MPFNet’s performance. During each iteration, the model was optimized solely on the training set (i.e., all data except that of the current test subject), and hyperparameter tuning was conducted only on a subset of the training data (i.e., the validation set). Importantly, the test subject’s data was never involved in the hyperparameter tuning process. As a result, the final test outcomes remain unaffected by the tuning process, ensuring the validity and reliability of our evaluation. The training process of both the GFE and AFE was also conducted using LOSO cross-validation.
For the SDE task, we use accuracy (Acc) and F1-score (F1) for evaluation; the F1-score is the more informative measure because it is robust to class imbalance, which is pronounced in the CASME II and SAMM datasets. For the CDE task, we follow MEGC 2019 and use the unweighted F1-score (UF1) and unweighted average recall (UAR) to assess model performance. UF1 is also known as the macro-averaged F1-score, and UAR as balanced accuracy. These metrics are computed as follows:</p>
        </para>
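        <para>
          <p>The LOSO protocol described above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names <text font="typewriter">loso_splits</text> and <text font="typewriter">split_validation</text>, the toy subject IDs, and the 20% tail split used for the validation subset are assumptions made here, since the text does not state how the validation subset is drawn.</p>
          <verbatim font="typewriter">
```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def split_validation(train_idx, val_fraction=0.2):
    """Carve a validation subset out of the training indices only.

    Assumption: a simple tail split; the selection rule for the
    validation subset is not specified in the text.
    """
    n_val = max(1, int(len(train_idx) * val_fraction))
    return train_idx[:-n_val], train_idx[-n_val:]


# Toy example: six samples recorded from three hypothetical subjects.
subject_ids = ["s01", "s01", "s02", "s03", "s03", "s03"]
for subject, train_idx, test_idx in loso_splits(subject_ids):
    fit_idx, val_idx = split_validation(train_idx)
    # Tuning only ever sees fit_idx and val_idx; test_idx stays untouched.
    assert not set(test_idx).intersection(train_idx)
    print(subject, fit_idx, val_idx, test_idx)
```
</verbatim>
          <p>Because the validation indices are taken from the training indices of each fold, the held-out subject cannot influence hyperparameter selection in this sketch.</p>
        </para>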
        <para xml:id="S4.SS2.SSS3.p2">
          <equation xml:id="S4.E20">
            <tags>
              <tag>(20)</tag>
              <tag role="autoref">Equation 20</tag>
              <tag role="refnum">20</tag>
            </tags>
            <Math mode="display" tex="\mathcal{}{Acc_{c}}=\frac{TP_{c}}{N_{c}}," text="A * c * c _ c = (T * P _ c) / N _ c" xml:id="S4.E20.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S4.E20.m1.1"/>
                  <XMWrap>
                    <XMApp xml:id="S4.E20.m1.1">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">A</XMTok>
                        <XMTok font="italic" role="UNKNOWN">c</XMTok>
                        <XMApp>
                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                          <XMTok font="italic" role="UNKNOWN">c</XMTok>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                        </XMApp>
                      </XMApp>
                      <XMApp>
                        <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMTok font="italic" role="UNKNOWN">T</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                            <XMTok font="italic" role="UNKNOWN">P</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                          </XMApp>
                        </XMApp>
                        <XMApp>
                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                          <XMTok font="italic" role="UNKNOWN">N</XMTok>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
        <para xml:id="S4.SS2.SSS3.p3">
          <equation xml:id="S4.E21">
            <tags>
              <tag>(21)</tag>
              <tag role="autoref">Equation 21</tag>
              <tag role="refnum">21</tag>
            </tags>
            <Math mode="display" tex="\mathcal{}{F1_{c}}=\frac{2TP_{c}}{2TP_{c}+FP_{c}+FN_{c}}," text="F * 1 _ c = (2 * T * P _ c) / (2 * T * P _ c + F * P _ c + F * N _ c)" xml:id="S4.E21.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S4.E21.m1.1"/>
                  <XMWrap>
                    <XMApp xml:id="S4.E21.m1.1">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">F</XMTok>
                        <XMApp>
                          <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                          <XMTok meaning="1" role="NUMBER">1</XMTok>
                          <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                        </XMApp>
                      </XMApp>
                      <XMApp>
                        <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                        <XMApp>
                          <XMTok meaning="times" role="MULOP">⁢</XMTok>
                          <XMTok meaning="2" role="NUMBER">2</XMTok>
                          <XMTok font="italic" role="UNKNOWN">T</XMTok>
                          <XMApp>
                            <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                            <XMTok font="italic" role="UNKNOWN">P</XMTok>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                          </XMApp>
                        </XMApp>
                        <XMApp>
                          <XMTok meaning="plus" role="ADDOP">+</XMTok>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok meaning="2" role="NUMBER">2</XMTok>
                            <XMTok font="italic" role="UNKNOWN">T</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                              <XMTok font="italic" role="UNKNOWN">P</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                              <XMTok font="italic" role="UNKNOWN">P</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post2"/>
                              <XMTok font="italic" role="UNKNOWN">N</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
        <para xml:id="S4.SS2.SSS3.p4">
          <equation xml:id="S4.E22">
            <tags>
              <tag>(22)</tag>
              <tag role="autoref">Equation 22</tag>
              <tag role="refnum">22</tag>
            </tags>
            <Math mode="display" tex="\mathcal{}{UF1}=\frac{1}{C}\sum_{c=1}^{C}F1_{c}," text="U * F * 1 = (1 / C) * ((sum _ (c = 1)) ^ C)@(F * 1 _ c)" xml:id="S4.E22.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S4.E22.m1.1"/>
                  <XMWrap>
                    <XMApp xml:id="S4.E22.m1.1">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">U</XMTok>
                        <XMTok font="italic" role="UNKNOWN">F</XMTok>
                        <XMTok meaning="1" role="NUMBER">1</XMTok>
                      </XMApp>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMApp>
                          <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                          <XMTok meaning="1" role="NUMBER">1</XMTok>
                          <XMTok font="italic" role="UNKNOWN">C</XMTok>
                        </XMApp>
                        <XMApp>
                          <XMApp scriptpos="mid">
                            <XMTok role="SUPERSCRIPTOP" scriptpos="mid1"/>
                            <XMApp scriptpos="mid">
                              <XMTok role="SUBSCRIPTOP" scriptpos="mid1"/>
                              <XMTok mathstyle="display" meaning="sum" role="SUMOP" scriptpos="mid">∑</XMTok>
                              <XMApp>
                                <XMTok fontsize="70%" meaning="equals" role="RELOP">=</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                              </XMApp>
                            </XMApp>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">C</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">F</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                              <XMTok meaning="1" role="NUMBER">1</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
        </para>
        <para xml:id="S4.SS2.SSS3.p5">
          <equation xml:id="S4.E23">
            <tags>
              <tag>(23)</tag>
              <tag role="autoref">Equation 23</tag>
              <tag role="refnum">23</tag>
            </tags>
            <Math mode="display" tex="\mathcal{}{UAR}=\frac{1}{C}\sum_{c=1}^{C}Acc_{c}," text="U * A * R = (1 / C) * ((sum _ (c = 1)) ^ C)@(A * c * c _ c)" xml:id="S4.E23.m1">
              <XMath>
                <XMDual>
                  <XMRef idref="S4.E23.m1.1"/>
                  <XMWrap>
                    <XMApp xml:id="S4.E23.m1.1">
                      <XMTok meaning="equals" role="RELOP">=</XMTok>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMTok font="italic" role="UNKNOWN">U</XMTok>
                        <XMTok font="italic" role="UNKNOWN">A</XMTok>
                        <XMTok font="italic" role="UNKNOWN">R</XMTok>
                      </XMApp>
                      <XMApp>
                        <XMTok meaning="times" role="MULOP">⁢</XMTok>
                        <XMApp>
                          <XMTok mathstyle="display" meaning="divide" role="FRACOP"/>
                          <XMTok meaning="1" role="NUMBER">1</XMTok>
                          <XMTok font="italic" role="UNKNOWN">C</XMTok>
                        </XMApp>
                        <XMApp>
                          <XMApp scriptpos="mid">
                            <XMTok role="SUPERSCRIPTOP" scriptpos="mid1"/>
                            <XMApp scriptpos="mid">
                              <XMTok role="SUBSCRIPTOP" scriptpos="mid1"/>
                              <XMTok mathstyle="display" meaning="sum" role="SUMOP" scriptpos="mid">∑</XMTok>
                              <XMApp>
                                <XMTok fontsize="70%" meaning="equals" role="RELOP">=</XMTok>
                                <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                                <XMTok fontsize="70%" meaning="1" role="NUMBER">1</XMTok>
                              </XMApp>
                            </XMApp>
                            <XMTok font="italic" fontsize="70%" role="UNKNOWN">C</XMTok>
                          </XMApp>
                          <XMApp>
                            <XMTok meaning="times" role="MULOP">⁢</XMTok>
                            <XMTok font="italic" role="UNKNOWN">A</XMTok>
                            <XMTok font="italic" role="UNKNOWN">c</XMTok>
                            <XMApp>
                              <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                              <XMTok font="italic" role="UNKNOWN">c</XMTok>
                              <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                            </XMApp>
                          </XMApp>
                        </XMApp>
                      </XMApp>
                    </XMApp>
                    <XMTok role="PUNCT">,</XMTok>
                  </XMWrap>
                </XMDual>
              </XMath>
            </Math>
          </equation>
          <p>where <Math mode="inline" tex="TP_{c}" text="T * P _ c" xml:id="S4.SS2.SSS3.p5.m1">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">T</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">P</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>, <Math mode="inline" tex="FP_{c}" text="F * P _ c" xml:id="S4.SS2.SSS3.p5.m2">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">F</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">P</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math>, and <Math mode="inline" tex="FN_{c}" text="F * N _ c" xml:id="S4.SS2.SSS3.p5.m3">
              <XMath>
                <XMApp>
                  <XMTok meaning="times" role="MULOP">⁢</XMTok>
                  <XMTok font="italic" role="UNKNOWN">F</XMTok>
                  <XMApp>
                    <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                    <XMTok font="italic" role="UNKNOWN">N</XMTok>
                    <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                  </XMApp>
                </XMApp>
              </XMath>
            </Math> are the numbers of true positives, false positives, and false negatives for the <Math mode="inline" tex="c" text="c" xml:id="S4.SS2.SSS3.p5.m4">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">c</XMTok>
              </XMath>
            </Math>-th class, respectively. <Math mode="inline" tex="N_{c}" text="N _ c" xml:id="S4.SS2.SSS3.p5.m6">
              <XMath>
                <XMApp>
                  <XMTok role="SUBSCRIPTOP" scriptpos="post1"/>
                  <XMTok font="italic" role="UNKNOWN">N</XMTok>
                  <XMTok font="italic" fontsize="70%" role="UNKNOWN">c</XMTok>
                </XMApp>
              </XMath>
            </Math> is the number of samples in the <Math mode="inline" tex="c" text="c" xml:id="S4.SS2.SSS3.p5.m7">
              <XMath>
                <XMTok font="italic" role="UNKNOWN">c</XMTok>
              </XMath>
            </Math>-th class.
</p>
        </para>
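        <para>
          <p>As a worked illustration of these metrics, the following Python sketch computes UF1 and UAR from lists of true and predicted labels, using the standard per-class F1 (twice the true positives over twice the true positives plus false positives plus false negatives) and per-class recall. The function names, the toy labels, and the convention of scoring an empty class as 0 are assumptions made for this example, not details taken from the evaluation code used in the experiments.</p>
          <verbatim font="typewriter">
```python
from collections import Counter


def per_class_counts(y_true, y_pred, num_classes):
    """Return per-class (TP, FP, FN, N) tuples for labels 0..num_classes-1."""
    tp, fp, fn, n = Counter(), Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        n[t] += 1
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample belongs to class t
            fn[t] += 1  # a class-t sample was missed
    return [(tp[c], fp[c], fn[c], n[c]) for c in range(num_classes)]


def uf1_uar(y_true, y_pred, num_classes):
    """Unweighted (macro) F1 and unweighted average recall."""
    f1_scores, recalls = [], []
    for tp, fp, fn, n in per_class_counts(y_true, y_pred, num_classes):
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted samples scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
        recalls.append(tp / n if n else 0.0)
    return sum(f1_scores) / num_classes, sum(recalls) / num_classes


# Toy example with an imbalanced 3-class label set.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 1, 1, 2, 0]
uf1, uar = uf1_uar(y_true, y_pred, num_classes=3)
print(round(uf1, 4), round(uar, 4))
```
</verbatim>
          <p>Every class contributes equally to both averages regardless of its sample count, which is why these metrics are preferred when the class distribution is imbalanced.</p>
        </para>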
      </subsubsection>
    </subsection>
    <subsection inlist="toc" xml:id="S4.SS3">
      <tags>
        <tag>4.3</tag>
        <tag role="autoref">subsection 4.3</tag>
        <tag role="refnum">4.3</tag>
        <tag role="typerefnum">§4.3</tag>
      </tags>
      <title><tag close=" ">4.3</tag><text font="italic">Implementation details</text></title>
      <para xml:id="S4.SS3.p1">
        <p>In the prior learning stage based on the triplet network, we employed an SGD optimizer with a momentum of 0.9 and a learning rate of 0.01. Training ran for 60 epochs with a batch size of 128 and a weight decay of 5<Math mode="inline" tex="\times" text="*" xml:id="S4.SS3.p1.m1">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math><Math mode="inline" tex="10^{-4}" text="10 ^ (- 4)" xml:id="S4.SS3.p1.m2">
            <XMath>
              <XMApp>
                <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                <XMTok meaning="10" role="NUMBER">10</XMTok>
                <XMApp>
                  <XMTok fontsize="70%" meaning="minus" role="ADDOP">-</XMTok>
                  <XMTok fontsize="70%" meaning="4" role="NUMBER">4</XMTok>
                </XMApp>
              </XMApp>
            </XMath>
          </Math>, yielding the trained GFE. For the prior learning stage based on amplified MEs, we applied a motion amplification algorithm to enhance the intensity of subtle movements in ME videos. This stage was trained for 80 epochs with an initial learning rate of 0.001, which was reduced by a factor of ten every 10 epochs. The SGD optimizer, with a momentum of 0.9 and a weight decay of 5<Math mode="inline" tex="\times" text="*" xml:id="S4.SS3.p1.m3">
            <XMath>
              <XMTok meaning="times" role="MULOP">×</XMTok>
            </XMath>
          </Math><Math mode="inline" tex="10^{-4}" text="10 ^ (- 4)" xml:id="S4.SS3.p1.m4">
            <XMath>
              <XMApp>
                <XMTok role="SUPERSCRIPTOP" scriptpos="post1"/>
                <XMTok meaning="10" role="NUMBER">10</XMTok>
                <XMApp>
                  <XMTok fontsize="70%" meaning="minus" role="ADDOP">-</XMTok>
                  <XMTok fontsize="70%" meaning="4" role="NUMBER">4</XMTok>
                </XMApp>
              </XMApp>
            </XMath>
          </Math>, was retained, yielding the trained AFE. Within the meta-learning framework, we adopted an episode-based training strategy to improve the model’s generalization under few-shot conditions. Specifically, we designed two few-shot learning configurations for MER, 3-way 5-shot and 5-way 5-shot, which correspond to the three-class and five-class classification scenarios, respectively. In each episode, the support set consists of five samples from each of the selected categories, and the query set samples are randomly selected from the remaining samples of each category, ensuring no overlap with the support set. During training, we used the SGD optimizer with a fixed learning rate of 0.05 and a momentum of 0.9. Each training batch contains 4 episodes; the loss is computed for each episode and averaged before the gradient update. All experiments were implemented in PyTorch and run on an NVIDIA RTX 4090 GPU.</p>
      </para>
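      <para>
        <p>The episode construction can be illustrated with a short Python sketch. It draws the support and query samples of each class in a single pass so that the two sets cannot overlap, and it groups 4 episodes into one training batch as described above. The function name <text font="typewriter">sample_episode</text>, the toy sample IDs, and the query size of 3 samples per class are assumptions for illustration only; the number of query samples per class is not stated in the text.</p>
        <verbatim font="typewriter">
```python
import random


def sample_episode(samples_by_class, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode with disjoint support and query sets."""
    classes = rng.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Draw support and query together so they cannot overlap.
        picked = rng.sample(samples_by_class[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query


# Toy data: 5 classes with 12 hypothetical sample IDs each.
data = {c: [f"c{c}_clip{i}" for i in range(12)] for c in range(5)}
rng = random.Random(0)

# One training batch of 4 episodes in the 3-way 5-shot setting.
# n_query=3 is an assumed value, not one reported in the paper.
batch = [sample_episode(data, n_way=3, k_shot=5, n_query=3, rng=rng)
         for _ in range(4)]
support, query = batch[0]
print(len(batch), len(support), len(query))  # prints: 4 15 9
```
</verbatim>
        <p>In a full training loop, the loss would be computed on the query set of each episode and averaged over the 4 episodes before one optimizer step; that part depends on the model and is omitted here.</p>
      </para>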
      <table inlist="lot" labels="LABEL:table:single_dataset" placement="b" xml:id="S4.T2">
        <tags>
          <tag>TABLE II</tag>
          <tag role="autoref">Table II</tag>
          <tag role="refnum">II</tag>
          <tag role="typerefnum">TABLE II</tag>
        </tags>
        <toccaption class="ltx_centering"><tag close=" ">II</tag>Comparison of MER performance on the SDE task across different algorithms. The best results are highlighted in bold and the second-best results are underlined. “–” indicates that the result was not reported</toccaption>
        <caption class="ltx_centering"><tag close=": ">TABLE II</tag>Comparison of MER performance on the SDE task across different algorithms. The best results are highlighted in bold and the second-best results are underlined. “–” indicates that the result was not reported</caption>
        <tabular class="ltx_centering ltx_guessed_headers" vattach="middle">
          <tbody>
            <tr>
              <td align="justify" border="tt t" rowspan="2" thead="row" width="102.4pt">Method</td>
              <td align="center" border="tt t" colspan="2">SMIC (3-class)</td>
              <td align="center" border="tt t" colspan="2">CASME II (5-class)</td>
              <td align="center" border="tt t" colspan="2">SAMM (5-class)</td>
            </tr>
            <tr>
              <td align="justify" border="t" width="45.5pt">Acc</td>
              <td align="justify" border="t" width="45.5pt">F1</td>
              <td align="justify" border="t" width="45.5pt">Acc</td>
              <td align="justify" border="t" width="45.5pt">F1</td>
              <td align="justify" border="t" width="45.5pt">Acc</td>
              <td align="justify" border="t" width="45.5pt">F1</td>
            </tr>
            <tr>
              <td align="justify" border="t" thead="row" width="102.4pt">LBP-TOP (2007) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2007dynamic" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" border="t" width="45.5pt">0.536</td>
              <td align="justify" border="t" width="45.5pt">0.538</td>
              <td align="justify" border="t" width="45.5pt">0.464</td>
              <td align="justify" border="t" width="45.5pt">0.424</td>
              <td align="justify" border="t" width="45.5pt">–</td>
              <td align="justify" border="t" width="45.5pt">–</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">DiSTLBP-RIP (2017) <cite class="ltx_citemacro_cite">[<bibref bibrefs="huang2017discriminative" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.634</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">0.647</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">–</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">Bi-WOOF (2018) <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2018less" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.593</td>
              <td align="justify" width="45.5pt">0.620</td>
              <td align="justify" width="45.5pt">0.589</td>
              <td align="justify" width="45.5pt">0.610</td>
              <td align="justify" width="45.5pt">0.598</td>
              <td align="justify" width="45.5pt">0.591</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">Micro-attention (2020) <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2020micro" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.494</td>
              <td align="justify" width="45.5pt">0.496</td>
              <td align="justify" width="45.5pt">0.659</td>
              <td align="justify" width="45.5pt">0.539</td>
              <td align="justify" width="45.5pt">0.485</td>
              <td align="justify" width="45.5pt">0.402</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">GEME (2021) <cite class="ltx_citemacro_cite">[<bibref bibrefs="nie2021geme" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.646</td>
              <td align="justify" width="45.5pt">0.616</td>
              <td align="justify" width="45.5pt">0.752</td>
              <td align="justify" width="45.5pt">0.735</td>
              <td align="justify" width="45.5pt">0.558</td>
              <td align="justify" width="45.5pt">0.454</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">MERSiamC3D (2021) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">0.818</td>
              <td align="justify" width="45.5pt"><text framed="underline">0.830</text></td>
              <td align="justify" width="45.5pt">0.687</td>
              <td align="justify" width="45.5pt">0.640</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">FeatRef (2022) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2022feature" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.579</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">0.628</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">0.601</td>
              <td align="justify" width="45.5pt">–</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">Res-CapsNet (2023) <cite class="ltx_citemacro_cite">[<bibref bibrefs="shu2023res" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">0.756</td>
              <td align="justify" width="45.5pt">0.749</td>
              <td align="justify" width="45.5pt">0.763</td>
              <td align="justify" width="45.5pt">0.736</td>
              <td align="justify" width="45.5pt">0.683</td>
              <td align="justify" width="45.5pt">0.543</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="102.4pt">SSRLTS-ViT (2024) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhang2024facial" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">–</td>
              <td align="justify" width="45.5pt">0.746</td>
              <td align="justify" width="45.5pt">0.736</td>
              <td align="justify" width="45.5pt"><text framed="underline">0.716</text></td>
              <td align="justify" width="45.5pt"><text framed="underline">0.716</text></td>
            </tr>
            <tr>
              <td align="justify" border="t" thead="row" width="102.4pt">MPFNet-P (Ours)</td>
              <td align="justify" border="t" width="45.5pt"><text framed="underline">0.787</text></td>
              <td align="justify" border="t" width="45.5pt"><text framed="underline">0.787</text></td>
              <td align="justify" border="t" width="45.5pt"><text framed="underline">0.820</text></td>
              <td align="justify" border="t" width="45.5pt">0.819</td>
              <td align="justify" border="t" width="45.5pt">0.704</td>
              <td align="justify" border="t" width="45.5pt">0.695</td>
            </tr>
            <tr>
              <td align="justify" border="bb b" thead="row" width="102.4pt">MPFNet-C (Ours)</td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.811</text></td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.811</text></td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.831</text></td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.833</text></td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.719</text></td>
              <td align="justify" border="bb b" width="45.5pt"><text class="ltx_wrap" font="bold">0.718</text></td>
            </tr>
          </tbody>
        </tabular>
      </table>
      <table inlist="lot" labels="LABEL:table:combined_datasets" placement="b" xml:id="S4.T3">
        <tags>
          <tag>TABLE III</tag>
          <tag role="autoref">Table III</tag>
          <tag role="refnum">III</tag>
          <tag role="typerefnum">TABLE III</tag>
        </tags>
        <toccaption class="ltx_centering"><tag close=" ">III</tag>Comparison of MER performance on the CDE task across different algorithms (3-class). The best results are highlighted in bold and the second-best results are underlined</toccaption>
        <caption class="ltx_centering"><tag close=": ">TABLE III</tag>Comparison of MER performance on the CDE task across different algorithms (3-class). The best results are highlighted in bold and the second-best results are underlined</caption>
        <tabular class="ltx_centering ltx_guessed_headers" vattach="middle">
          <tbody>
            <tr>
              <td align="justify" border="tt t" rowspan="2" thead="row" width="99.6pt">Method</td>
              <td align="center" border="tt t" colspan="2">SMIC</td>
              <td align="center" border="tt t" colspan="2">CASME II</td>
              <td align="center" border="tt t" colspan="2">SAMM</td>
              <td align="center" border="tt t" colspan="2">MEGC2019-CD</td>
            </tr>
            <tr>
              <td align="justify" border="t" width="31.3pt">UF1</td>
              <td align="justify" border="t" width="31.3pt">UAR</td>
              <td align="justify" border="t" width="31.3pt">UF1</td>
              <td align="justify" border="t" width="31.3pt">UAR</td>
              <td align="justify" border="t" width="31.3pt">UF1</td>
              <td align="justify" border="t" width="31.3pt">UAR</td>
              <td align="justify" border="t" width="31.3pt">UF1</td>
              <td align="justify" border="t" width="31.3pt">UAR</td>
            </tr>
            <tr>
              <td align="justify" border="t" thead="row" width="99.6pt">LBP-TOP (2007) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2007dynamic" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" border="t" width="31.3pt">0.200</td>
              <td align="justify" border="t" width="31.3pt">0.528</td>
              <td align="justify" border="t" width="31.3pt">0.703</td>
              <td align="justify" border="t" width="31.3pt">0.743</td>
              <td align="justify" border="t" width="31.3pt">0.395</td>
              <td align="justify" border="t" width="31.3pt">0.410</td>
              <td align="justify" border="t" width="31.3pt">0.588</td>
              <td align="justify" border="t" width="31.3pt">0.579</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">Bi-WOOF (2018) <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2018less" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.573</td>
              <td align="justify" width="31.3pt">0.583</td>
              <td align="justify" width="31.3pt">0.781</td>
              <td align="justify" width="31.3pt">0.803</td>
              <td align="justify" width="31.3pt">0.521</td>
              <td align="justify" width="31.3pt">0.513</td>
              <td align="justify" width="31.3pt">0.629</td>
              <td align="justify" width="31.3pt">0.623</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">CapsuleNet (2019) <cite class="ltx_citemacro_cite">[<bibref bibrefs="van2019capsulenet" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.582</td>
              <td align="justify" width="31.3pt">0.587</td>
              <td align="justify" width="31.3pt">0.707</td>
              <td align="justify" width="31.3pt">0.702</td>
              <td align="justify" width="31.3pt">0.621</td>
              <td align="justify" width="31.3pt">0.598</td>
              <td align="justify" width="31.3pt">0.652</td>
              <td align="justify" width="31.3pt">0.651</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">STSTNet (2019) <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2019shallow" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.680</td>
              <td align="justify" width="31.3pt">0.701</td>
              <td align="justify" width="31.3pt">0.838</td>
              <td align="justify" width="31.3pt">0.869</td>
              <td align="justify" width="31.3pt">0.659</td>
              <td align="justify" width="31.3pt">0.681</td>
              <td align="justify" width="31.3pt">0.735</td>
              <td align="justify" width="31.3pt">0.760</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">RCN-A (2020) <cite class="ltx_citemacro_cite">[<bibref bibrefs="xia2020revealing" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.633</td>
              <td align="justify" width="31.3pt">0.644</td>
              <td align="justify" width="31.3pt">0.851</td>
              <td align="justify" width="31.3pt">0.812</td>
              <td align="justify" width="31.3pt">0.760</td>
              <td align="justify" width="31.3pt">0.672</td>
              <td align="justify" width="31.3pt">0.743</td>
              <td align="justify" width="31.3pt">0.719</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">MERSiamC3D (2021) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.736</td>
              <td align="justify" width="31.3pt">0.760</td>
              <td align="justify" width="31.3pt">0.882</td>
              <td align="justify" width="31.3pt">0.876</td>
              <td align="justify" width="31.3pt">0.748</td>
              <td align="justify" width="31.3pt">0.728</td>
              <td align="justify" width="31.3pt">0.807</td>
              <td align="justify" width="31.3pt">0.799</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">AU-GCN (2021) <cite class="ltx_citemacro_cite">[<bibref bibrefs="lei2021micro" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.719</td>
              <td align="justify" width="31.3pt">0.721</td>
              <td align="justify" width="31.3pt">0.879</td>
              <td align="justify" width="31.3pt">0.871</td>
              <td align="justify" width="31.3pt">0.775</td>
              <td align="justify" width="31.3pt">0.789</td>
              <td align="justify" width="31.3pt">0.791</td>
              <td align="justify" width="31.3pt">0.793</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">FeatRef (2022) <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2022feature" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.701</td>
              <td align="justify" width="31.3pt">0.708</td>
              <td align="justify" width="31.3pt">0.891</td>
              <td align="justify" width="31.3pt">0.887</td>
              <td align="justify" width="31.3pt">0.737</td>
              <td align="justify" width="31.3pt">0.715</td>
              <td align="justify" width="31.3pt">0.783</td>
              <td align="justify" width="31.3pt">0.783</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">Res-CapsNet (2023) <cite class="ltx_citemacro_cite">[<bibref bibrefs="shu2023res" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.690</td>
              <td align="justify" width="31.3pt">0.685</td>
              <td align="justify" width="31.3pt">0.812</td>
              <td align="justify" width="31.3pt">0.813</td>
              <td align="justify" width="31.3pt">0.680</td>
              <td align="justify" width="31.3pt">0.686</td>
              <td align="justify" width="31.3pt">0.741</td>
              <td align="justify" width="31.3pt">0.745</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">RNAS-MER (2023) <cite class="ltx_citemacro_cite">[<bibref bibrefs="verma2023rnas" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.744</td>
              <td align="justify" width="31.3pt">0.762</td>
              <td align="justify" width="31.3pt">0.898</td>
              <td align="justify" width="31.3pt">0.907</td>
              <td align="justify" width="31.3pt">0.788</td>
              <td align="justify" width="31.3pt">0.823</td>
              <td align="justify" width="31.3pt"><text framed="underline">0.830</text></td>
              <td align="justify" width="31.3pt"><text class="ltx_wrap" font="bold">0.851</text></td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">LAENet (2024) <cite class="ltx_citemacro_cite">[<bibref bibrefs="gan2024laenet" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.662</td>
              <td align="justify" width="31.3pt">0.652</td>
              <td align="justify" width="31.3pt"><text framed="underline">0.910</text></td>
              <td align="justify" width="31.3pt"><text framed="underline">0.911</text></td>
              <td align="justify" width="31.3pt">0.681</td>
              <td align="justify" width="31.3pt">0.662</td>
              <td align="justify" width="31.3pt">0.756</td>
              <td align="justify" width="31.3pt">0.740</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="99.6pt">TFT (2024) <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2024two" separator="," yyseparator=","/>]</cite></td>
              <td align="justify" width="31.3pt">0.741</td>
              <td align="justify" width="31.3pt">0.718</td>
              <td align="justify" width="31.3pt">0.907</td>
              <td align="justify" width="31.3pt">0.909</td>
              <td align="justify" width="31.3pt">0.709</td>
              <td align="justify" width="31.3pt">0.656</td>
              <td align="justify" width="31.3pt">0.814</td>
              <td align="justify" width="31.3pt">0.801</td>
            </tr>
            <tr>
              <td align="justify" border="t" thead="row" width="99.6pt">MPFNet-P (Ours)</td>
              <td align="justify" border="t" width="31.3pt"><text framed="underline">0.781</text></td>
              <td align="justify" border="t" width="31.3pt"><text framed="underline">0.783</text></td>
              <td align="justify" border="t" width="31.3pt">0.879</td>
              <td align="justify" border="t" width="31.3pt">0.895</td>
              <td align="justify" border="t" width="31.3pt"><text framed="underline">0.790</text></td>
              <td align="justify" border="t" width="31.3pt"><text framed="underline">0.835</text></td>
              <td align="justify" border="t" width="31.3pt">0.811</td>
              <td align="justify" border="t" width="31.3pt">0.820</td>
            </tr>
            <tr>
              <td align="justify" border="bb b" thead="row" width="99.6pt">MPFNet-C (Ours)</td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.806</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.809</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.911</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.923</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.795</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.839</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text class="ltx_wrap" font="bold">0.840</text></td>
              <td align="justify" border="bb b" width="31.3pt"><text framed="underline">0.846</text></td>
            </tr>
          </tbody>
        </tabular>
      </table>
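      <para>
        <p>The UF1 and UAR columns in the table above are the class-balanced metrics used by the MEGC 2019 protocol: UF1 is the unweighted mean of the per-class F1-scores, and UAR is the unweighted mean of the per-class recalls. The following is a minimal sketch of how these two scores can be computed from predicted and ground-truth labels. The function names and the toy labels are illustrative assumptions and are not taken from the MPFNet implementation.</p>
        <p><verbatim font="typewriter">
```python
# Illustrative sketch of UF1 and UAR; helper names are hypothetical.

def per_class_counts(y_true, y_pred, num_classes):
    """Count true positives, false positives and false negatives per class."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample is not p
            fn[t] += 1  # true class t was missed
    return tp, fp, fn


def uf1(y_true, y_pred, num_classes):
    """Unweighted F1: mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    tp, fp, fn = per_class_counts(y_true, y_pred, num_classes)
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


def uar(y_true, y_pred, num_classes):
    """Unweighted average recall: mean of per-class TP / (TP + FN)."""
    tp, _, fn = per_class_counts(y_true, y_pred, num_classes)
    recalls = []
    for c in range(num_classes):
        support = tp[c] + fn[c]
        recalls.append(tp[c] / support if support else 0.0)
    return sum(recalls) / num_classes


# Toy three-class example (labels are made up for illustration).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
print(round(uf1(y_true, y_pred, 3), 3), round(uar(y_true, y_pred, 3), 3))
```
</verbatim></p>
        <p>On the toy labels, the sketch gives a UF1 of about 0.739 and a UAR of 0.75. Because every class contributes equally to both averages, a model cannot score well by favoring the majority class, which matters for the imbalanced ME datasets. Under leave-one-subject-out evaluation, the per-class counts are, as we understand the protocol, accumulated over all folds before the two scores are computed.</p>
      </para>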
    </subsection>
  </section>
  <section inlist="toc" labels="LABEL:Results_and_analysis" xml:id="S5">
    <tags>
      <tag>5</tag>
      <tag role="autoref">section 5</tag>
      <tag role="refnum">5</tag>
      <tag role="typerefnum">§5</tag>
    </tags>
    <title><tag close=" ">5</tag><text font="smallcaps">Results and analysis</text></title>
    <para xml:id="S5.p1">
      <p>In this section, we first evaluate the performance of MPFNet on the SDE and CDE tasks. Next, we conduct ablation experiments to analyze the contributions of the different types of prior knowledge and visual features. We then examine the impact of hyperparameter settings on model performance. Finally, we validate the effectiveness of the multi-prior learning strategy through visual analysis.</p>
    </para>
    <subsection inlist="toc" xml:id="S5.SS1">
      <tags>
        <tag>5.1</tag>
        <tag role="autoref">subsection 5.1</tag>
        <tag role="refnum">5.1</tag>
        <tag role="typerefnum">§5.1</tag>
      </tags>
      <title><tag close=" ">5.1</tag><text font="italic">Results of the SDE task</text></title>
      <para xml:id="S5.SS1.p1">
        <p>For the SDE task, we compare MPFNet against several established MER methods. The comparison covers traditional hand-crafted feature-based approaches, including LBP-TOP <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2007dynamic" separator="," yyseparator=","/>]</cite>, DiSTLBP-RIP <cite class="ltx_citemacro_cite">[<bibref bibrefs="huang2017discriminative" separator="," yyseparator=","/>]</cite>, and Bi-WOOF <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2018less" separator="," yyseparator=","/>]</cite>, as well as deep learning methods such as Micro-attention <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2020micro" separator="," yyseparator=","/>]</cite>, GEME <cite class="ltx_citemacro_cite">[<bibref bibrefs="nie2021geme" separator="," yyseparator=","/>]</cite>, MERSiamC3D <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two" separator="," yyseparator=","/>]</cite>, FeatRef <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2022feature" separator="," yyseparator=","/>]</cite>, Res-CapsNet <cite class="ltx_citemacro_cite">[<bibref bibrefs="shu2023res" separator="," yyseparator=","/>]</cite>, and SSRLTS-ViT <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhang2024facial" separator="," yyseparator=","/>]</cite>. The experimental results are presented in Table <ref labelref="LABEL:table:single_dataset"/>, where the best results are shown in bold and the second-best results are underlined. MPFNet-C achieves the best performance, attaining the highest accuracy and F1-score.</p>
      </para>
      <para xml:id="S5.SS1.p2">
        <p><text font="bold">Comparison with prior learning-based methods.</text> Experimental results demonstrate that the multi-prior fusion strategy outperforms methods that rely on a single type of prior knowledge, such as GEME and MERSiamC3D. Compared with these two methods, both MPFNet-P and MPFNet-C achieve higher classification accuracy and F1-score on all datasets. For instance, on the SAMM dataset, MPFNet-C improves accuracy by 16.10% over GEME and by 3.20% over MERSiamC3D. Similarly, its F1-score surpasses that of GEME by 0.264 and that of MERSiamC3D by 0.078. We attribute this improvement to the fact that MPFNet integrates a more diverse and complementary set of prior knowledge, whereas GEME relies solely on gender features as prior knowledge, and MERSiamC3D obtains its prior knowledge only by determining whether sample pairs are the same or different.</p>
      </para>
      <para xml:id="S5.SS1.p3">
        <p><text font="bold">Comparison with attention-based methods.</text> MPFNet also outperforms several attention-based methods, such as Micro-attention, FeatRef, and Res-CapsNet. For instance, compared with Res-CapsNet, which employs the ECA channel attention module <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2020eca" separator="," yyseparator=","/>]</cite>, MPFNet-C achieves an accuracy improvement of 5.50% on the SMIC dataset, 6.80% on the CASME II dataset, and 3.60% on the SAMM dataset. Additionally, MPFNet-C increases the F1-score by 0.062 on SMIC, 0.097 on CASME II, and 0.175 on SAMM. We attribute this improvement to the proposed CA-I3D model, which captures spatiotemporal features and channel-wise information simultaneously and thereby enhances MER performance.</p>
      </para>
      <table inlist="lot" labels="LABEL:tab:ablation-prior" placement="b" xml:id="S5.T4">
        <tags>
          <tag>TABLE IV</tag>
          <tag role="autoref">Table IV</tag>
          <tag role="refnum">IV</tag>
          <tag role="typerefnum">TABLE IV</tag>
        </tags>
        <toccaption class="ltx_centering"><tag close=" ">IV</tag>Ablation study of the prior learning strategy across three datasets. The best results are highlighted in bold</toccaption>
        <caption class="ltx_centering"><tag close=": ">TABLE IV</tag>Ablation study of the prior learning strategy across three datasets. The best results are highlighted in bold</caption>
        <tabular class="ltx_centering ltx_guessed_headers" vattach="middle">
          <thead>
            <tr>
              <td align="justify" border="tt t" rowspan="2" thead="column row" width="91.0pt"><tabular vattach="middle">
                  <tr>
                    <td align="left">Prior learning (PL)</td>
                  </tr>
                  <tr>
                    <td align="left">strategy</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">SMIC</td>
                  </tr>
                  <tr>
                    <td align="center">(3-class)</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">SMIC</td>
                  </tr>
                  <tr>
                    <td align="center">(5-class)</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">CASME II</td>
                  </tr>
                  <tr>
                    <td align="center">(3-class)</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">CASME II</td>
                  </tr>
                  <tr>
                    <td align="center">(5-class)</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">SAMM</td>
                  </tr>
                  <tr>
                    <td align="center">(3-class)</td>
                  </tr>
                </tabular></td>
              <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                  <tr>
                    <td align="center">SAMM</td>
                  </tr>
                  <tr>
                    <td align="center">(5-class)</td>
                  </tr>
                </tabular></td>
            </tr>
            <tr>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
              <td align="center" border="t" thead="column">Acc</td>
              <td align="center" border="t" thead="column">F1</td>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="justify" border="t" thead="row" width="91.0pt">w/o PL</td>
              <td align="justify" border="t" width="17.6pt">0.579</td>
              <td align="justify" border="t" width="17.6pt">0.578</td>
              <td align="justify" border="t" width="17.6pt">0.423</td>
              <td align="justify" border="t" width="17.6pt">0.421</td>
              <td align="justify" border="t" width="17.6pt">0.621</td>
              <td align="justify" border="t" width="17.6pt">0.625</td>
              <td align="justify" border="t" width="17.6pt">0.584</td>
              <td align="justify" border="t" width="17.6pt">0.577</td>
              <td align="justify" border="t" width="17.6pt">0.548</td>
              <td align="justify" border="t" width="17.6pt">0.568</td>
              <td align="justify" border="t" width="17.6pt">0.414</td>
              <td align="justify" border="t" width="17.6pt">0.401</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="91.0pt">PLTN</td>
              <td align="justify" width="17.6pt">0.646</td>
              <td align="justify" width="17.6pt">0.645</td>
              <td align="justify" width="17.6pt">0.493</td>
              <td align="justify" width="17.6pt">0.504</td>
              <td align="justify" width="17.6pt">0.745</td>
              <td align="justify" width="17.6pt">0.750</td>
              <td align="justify" width="17.6pt">0.686</td>
              <td align="justify" width="17.6pt">0.687</td>
              <td align="justify" width="17.6pt">0.706</td>
              <td align="justify" width="17.6pt">0.720</td>
              <td align="justify" width="17.6pt">0.638</td>
              <td align="justify" width="17.6pt">0.641</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="91.0pt">PLSM</td>
              <td align="justify" width="17.6pt">0.701</td>
              <td align="justify" width="17.6pt">0.701</td>
              <td align="justify" width="17.6pt">0.563</td>
              <td align="justify" width="17.6pt">0.571</td>
              <td align="justify" width="17.6pt">0.834</td>
              <td align="justify" width="17.6pt">0.838</td>
              <td align="justify" width="17.6pt">0.739</td>
              <td align="justify" width="17.6pt">0.737</td>
              <td align="justify" width="17.6pt">0.759</td>
              <td align="justify" width="17.6pt">0.771</td>
              <td align="justify" width="17.6pt">0.665</td>
              <td align="justify" width="17.6pt">0.664</td>
            </tr>
            <tr>
              <td align="justify" thead="row" width="91.0pt">MPFNet-P (keep all)</td>
              <td align="justify" width="17.6pt">0.787</td>
              <td align="justify" width="17.6pt">0.787</td>
              <td align="justify" width="17.6pt">0.649</td>
              <td align="justify" width="17.6pt">0.631</td>
              <td align="justify" width="17.6pt">0.897</td>
              <td align="justify" width="17.6pt">0.898</td>
              <td align="justify" width="17.6pt">0.820</td>
              <td align="justify" width="17.6pt">0.819</td>
              <td align="justify" width="17.6pt">0.850</td>
              <td align="justify" width="17.6pt">0.856</td>
              <td align="justify" width="17.6pt">0.704</td>
              <td align="justify" width="17.6pt">0.695</td>
            </tr>
            <tr>
              <td align="justify" border="bb b" thead="row" width="91.0pt">MPFNet-C (keep all)</td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.811</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.811</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.663</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.652</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.924</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.925</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.835</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.833</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.857</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.863</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.719</text></td>
              <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.721</text></td>
            </tr>
          </tbody>
        </tabular>
      </table>
      <para xml:id="S5.SS1.p4">
        <p><text font="bold">Comparison with key-frame-based methods.</text> We also compare our approach with several key-frame-based MER methods, including Micro-attention, Res-CapsNet, and SSRLTS-ViT. These methods primarily rely on optical flow information between the onset and apex frames of MEs for feature extraction. Experimental results demonstrate that our video sequence-based approach outperforms these key-frame-based methods. Specifically, compared with Res-CapsNet on the SMIC dataset, MPFNet-P achieves 5.50% higher accuracy and a 0.062 higher F1-score. On the CASME II dataset, MPFNet-P improves accuracy by 6.80% and F1-score by 0.097, while on the SAMM dataset, accuracy increases by 3.60% and F1-score by 0.175. These gains can be attributed to the ability of video sequence-based methods to capture the continuous temporal dynamics and subtle facial motion variations of MEs more comprehensively, whereas key-frame-based methods may fail to retain such critical information. These results also further validate the effectiveness of the adopted frame interpolation algorithm, which reconstructs high-quality motion information of MEs and thereby enhances overall recognition performance.</p>
      </para>
    </subsection>
    <subsection inlist="toc" xml:id="S5.SS2">
      <tags>
        <tag>5.2</tag>
        <tag role="autoref">subsection 5.2</tag>
        <tag role="refnum">5.2</tag>
        <tag role="typerefnum">§5.2</tag>
      </tags>
      <title><tag close=" ">5.2</tag><text font="italic">Results of the CDE task</text></title>
      <para xml:id="S5.SS2.p1">
        <p>This section further validates the effectiveness of MPFNet on the CDE task. We strictly follow the MEGC 2019 protocol and conduct a series of three-class classification experiments on the SMIC, CASME II, and SAMM datasets and on their composite dataset, MEGC2019-CD. We compare MPFNet with both traditional handcrafted methods, such as LBP-TOP <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2007dynamic" separator="," yyseparator=","/>]</cite> and Bi-WOOF <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2018less" separator="," yyseparator=","/>]</cite>, and deep learning methods, such as CapsuleNet <cite class="ltx_citemacro_cite">[<bibref bibrefs="van2019capsulenet" separator="," yyseparator=","/>]</cite>, STSTNet <cite class="ltx_citemacro_cite">[<bibref bibrefs="liong2019shallow" separator="," yyseparator=","/>]</cite>, RCN-A <cite class="ltx_citemacro_cite">[<bibref bibrefs="xia2020revealing" separator="," yyseparator=","/>]</cite>, MERSiamC3D <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhao2021two" separator="," yyseparator=","/>]</cite>, FeatRef <cite class="ltx_citemacro_cite">[<bibref bibrefs="zhou2022feature" separator="," yyseparator=","/>]</cite>, Res-CapsNet <cite class="ltx_citemacro_cite">[<bibref bibrefs="shu2023res" separator="," yyseparator=","/>]</cite>, RNAS-MER <cite class="ltx_citemacro_cite">[<bibref bibrefs="verma2023rnas" separator="," yyseparator=","/>]</cite>, LAENet <cite class="ltx_citemacro_cite">[<bibref bibrefs="gan2024laenet" separator="," yyseparator=","/>]</cite>, and TFT <cite class="ltx_citemacro_cite">[<bibref bibrefs="wang2024two" separator="," yyseparator=","/>]</cite>. The experimental results are presented in Table <ref labelref="LABEL:table:combined_datasets"/>. MPFNet-C achieves the highest UF1 on all four evaluation sets and the highest UAR on all of them except MEGC2019-CD. The largest gains over recent deep learning methods such as LAENet and TFT appear on the SMIC and SAMM datasets. Specifically, on the SMIC dataset, MPFNet-C outperforms LAENet by 0.144 in UF1 and 0.157 in UAR. Similarly, on the SAMM dataset, MPFNet-C achieves UF1 and UAR improvements of 0.114 and 0.177, respectively, over LAENet. However, on the MEGC2019-CD dataset, MPFNet-C attains a slightly lower UAR than RNAS-MER (0.846 versus 0.851). This discrepancy may be attributed to the fact that RNAS-MER is specifically optimized for the three-class classification task of the MEGC2019-CD dataset, whereas our model is better suited to learning fine-grained categories, with its advantages becoming more pronounced in five-class classification tasks.</p>
      </para>
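      <para>
        <p>The margins over LAENet quoted above can be recomputed directly from the UF1 and UAR entries in Table <ref labelref="LABEL:table:combined_datasets"/>. The short sketch below performs this arithmetic check. The dictionary layout and the function name <text font="typewriter">margin</text> are illustrative assumptions, and only the SMIC and SAMM entries of the two methods are copied from the table.</p>
        <p><verbatim font="typewriter">
```python
# (UF1, UAR) pairs copied from the CDE results table; the layout is illustrative.
cde_scores = {
    "LAENet": {"SMIC": (0.662, 0.652), "SAMM": (0.681, 0.662)},
    "MPFNet-C": {"SMIC": (0.806, 0.809), "SAMM": (0.795, 0.839)},
}


def margin(ours, baseline, dataset, scores=cde_scores):
    """Return the (UF1, UAR) gain of `ours` over `baseline`, rounded to 3 decimals."""
    ours_uf1, ours_uar = scores[ours][dataset]
    base_uf1, base_uar = scores[baseline][dataset]
    return round(ours_uf1 - base_uf1, 3), round(ours_uar - base_uar, 3)


for dataset in ("SMIC", "SAMM"):
    print(dataset, margin("MPFNet-C", "LAENet", dataset))
```
</verbatim></p>
        <p>Running the sketch prints gains of (0.144, 0.157) on SMIC and (0.114, 0.177) on SAMM. These are absolute differences in UF1 and UAR, not relative percentages.</p>
      </para>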
    </subsection>
    <subsection inlist="toc" labels="LABEL:Ablation_Study" xml:id="S5.SS3">
      <tags>
        <tag>5.3</tag>
        <tag role="autoref">subsection 5.3</tag>
        <tag role="refnum">5.3</tag>
        <tag role="typerefnum">§5.3</tag>
      </tags>
      <title><tag close=" ">5.3</tag><text font="italic">Ablation study</text></title>
      <para xml:id="S5.SS3.p1">
        <p>To assess the effectiveness of the proposed multi-prior fusion strategy and of the visual features (optical flow and frame difference), we conduct a series of ablation experiments on the SMIC, CASME II, and SAMM datasets.</p>
      </para>
      <subsubsection inlist="toc" xml:id="S5.SS3.SSS1">
        <tags>
          <tag>5.3.1</tag>
          <tag role="autoref">subsubsection 5.3.1</tag>
          <tag role="refnum">5.3.1</tag>
          <tag role="typerefnum">§5.3.1</tag>
        </tags>
        <title><tag close=" ">5.3.1</tag>The effect of the prior learning strategy</title>
        <table inlist="lot" labels="LABEL:table:ablation-visual" placement="t" xml:id="S5.T5">
          <tags>
            <tag>TABLE V</tag>
            <tag role="autoref">Table V</tag>
            <tag role="refnum">V</tag>
            <tag role="typerefnum">TABLE V</tag>
          </tags>
          <toccaption class="ltx_centering"><tag close=" ">V</tag>Ablation study of visual features across three datasets. The best results are highlighted in bold</toccaption>
          <caption class="ltx_centering"><tag close=": ">TABLE V</tag>Ablation study of visual features across three datasets. The best results are highlighted in bold</caption>
          <tabular class="ltx_centering ltx_guessed_headers" vattach="middle">
            <thead>
              <tr>
                <td align="justify" border="tt t" rowspan="2" thead="column row" width="91.0pt"><tabular vattach="middle">
                    <tr>
                      <td align="center">Visual feature</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">SMIC</td>
                    </tr>
                    <tr>
                      <td align="center">(3-class)</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">SMIC</td>
                    </tr>
                    <tr>
                      <td align="center">(5-class)</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">CASME II</td>
                    </tr>
                    <tr>
                      <td align="center">(3-class)</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">CASME II</td>
                    </tr>
                    <tr>
                      <td align="center">(5-class)</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">SAMM</td>
                    </tr>
                    <tr>
                      <td align="center">(3-class)</td>
                    </tr>
                  </tabular></td>
                <td align="center" border="tt t" colspan="2" thead="column"><tabular vattach="middle">
                    <tr>
                      <td align="center">SAMM</td>
                    </tr>
                    <tr>
                      <td align="center">(5-class)</td>
                    </tr>
                  </tabular></td>
              </tr>
              <tr>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
                <td align="center" border="t" thead="column">Acc</td>
                <td align="center" border="t" thead="column">F1</td>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td align="justify" border="t" thead="row" width="91.0pt">Optical flow (MPFNet-P)</td>
                <td align="justify" border="t" width="17.6pt">0.731</td>
                <td align="justify" border="t" width="17.6pt">0.732</td>
                <td align="justify" border="t" width="17.6pt">0.593</td>
                <td align="justify" border="t" width="17.6pt">0.591</td>
                <td align="justify" border="t" width="17.6pt">0.834</td>
                <td align="justify" border="t" width="17.6pt">0.838</td>
                <td align="justify" border="t" width="17.6pt">0.769</td>
                <td align="justify" border="t" width="17.6pt">0.767</td>
                <td align="justify" border="t" width="17.6pt">0.788</td>
                <td align="justify" border="t" width="17.6pt">0.780</td>
                <td align="justify" border="t" width="17.6pt">0.665</td>
                <td align="justify" border="t" width="17.6pt">0.662</td>
              </tr>
              <tr>
                <td align="justify" thead="row" width="91.0pt">Frame difference (MPFNet-P)</td>
                <td align="justify" width="17.6pt">0.530</td>
                <td align="justify" width="17.6pt">0.532</td>
                <td align="justify" width="17.6pt">0.421</td>
                <td align="justify" width="17.6pt">0.422</td>
                <td align="justify" width="17.6pt">0.623</td>
                <td align="justify" width="17.6pt">0.620</td>
                <td align="justify" width="17.6pt">0.579</td>
                <td align="justify" width="17.6pt">0.577</td>
                <td align="justify" width="17.6pt">0.599</td>
                <td align="justify" width="17.6pt">0.591</td>
                <td align="justify" width="17.6pt">0.503</td>
                <td align="justify" width="17.6pt">0.500</td>
              </tr>
              <tr>
                <td align="justify" thead="row" width="91.0pt">MPFNet-P (keep all)</td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.787</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.787</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.649</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.631</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.897</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.898</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.820</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.819</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.850</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.856</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.704</text></td>
                <td align="justify" width="17.6pt"><text class="ltx_wrap" font="bold">0.695</text></td>
              </tr>
              <tr>
                <td align="justify" border="t" thead="row" width="91.0pt">Optical flow (MPFNet-C)</td>
                <td align="justify" border="t" width="17.6pt">0.752</td>
                <td align="justify" border="t" width="17.6pt">0.750</td>
                <td align="justify" border="t" width="17.6pt">0.614</td>
                <td align="justify" border="t" width="17.6pt">0.615</td>
                <td align="justify" border="t" width="17.6pt">0.865</td>
                <td align="justify" border="t" width="17.6pt">0.866</td>
                <td align="justify" border="t" width="17.6pt">0.782</td>
                <td align="justify" border="t" width="17.6pt">0.783</td>
                <td align="justify" border="t" width="17.6pt">0.799</td>
                <td align="justify" border="t" width="17.6pt">0.800</td>
                <td align="justify" border="t" width="17.6pt">0.678</td>
                <td align="justify" border="t" width="17.6pt">0.680</td>
              </tr>
              <tr>
                <td align="justify" thead="row" width="91.0pt">Frame difference (MPFNet-C)</td>
                <td align="justify" width="17.6pt">0.573</td>
                <td align="justify" width="17.6pt">0.571</td>
                <td align="justify" width="17.6pt">0.496</td>
                <td align="justify" width="17.6pt">0.495</td>
                <td align="justify" width="17.6pt">0.662</td>
                <td align="justify" width="17.6pt">0.664</td>
                <td align="justify" width="17.6pt">0.596</td>
                <td align="justify" width="17.6pt">0.597</td>
                <td align="justify" width="17.6pt">0.609</td>
                <td align="justify" width="17.6pt">0.607</td>
                <td align="justify" width="17.6pt">0.515</td>
                <td align="justify" width="17.6pt">0.514</td>
              </tr>
              <tr>
                <td align="justify" border="bb b" thead="row" width="91.0pt">MPFNet-C (keep all)</td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.811</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.811</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.663</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.652</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.924</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.925</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.835</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.833</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.857</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.863</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.719</text></td>
                <td align="justify" border="bb b" width="17.6pt"><text class="ltx_wrap" font="bold">0.721</text></td>
              </tr>
            </tbody>
          </tabular>
        </table>
        <para xml:id="S5.SS3.SSS1.p1">
          <p>We design several experimental conditions that progressively incorporate prior knowledge: no prior learning (w/o PL), in which the encoder is trained from scratch; prior learning based on a triplet network (PLTN); prior learning based on sample-balanced, motion-amplified MEs (PLSM); and the fusion of both forms of prior learning within the MPFNet-P or MPFNet-C framework. The experimental results are presented in Table <ref labelref="LABEL:tab:ablation-prior"/>. Model performance improves markedly as prior knowledge is added, and the MPFNet variants, which adopt the multi-prior fusion strategy, achieve the best performance. Furthermore, MPFNet-C outperforms MPFNet-P on all evaluation metrics, indicating that the cascaded feature encoder structure is more effective for MER than the parallel feature encoder structure.</p>
        </para>
      </subsubsection>
      <subsubsection inlist="toc" xml:id="S5.SS3.SSS2">
        <tags>
          <tag>5.3.2</tag>
          <tag role="autoref">subsubsection 5.3.2</tag>
          <tag role="refnum">5.3.2</tag>
          <tag role="typerefnum">§5.3.2</tag>
        </tags>
        <title><tag close=" ">5.3.2</tag>The effect of visual features</title>
        <para xml:id="S5.SS3.SSS2.p1">
          <p>Optical flow is a key motion representation that captures pixel-level motion between video frames, and it has therefore been widely applied in MER. Frame difference features, which quantify pixel intensity changes between consecutive frames, provide complementary visual cues. This study combines both features into a single visual representation, as described in Section <ref labelref="LABEL:method"/>. Few studies have evaluated the relative contributions of these two visual features to MER. To fill this gap, we conduct an ablation experiment on both the MPFNet-P and MPFNet-C architectures under three feature configurations: (i) optical flow features only, (ii) frame difference features only, and (iii) both features combined. As shown in Table <ref labelref="LABEL:table:ablation-visual"/>, two findings emerge. First, optical flow features consistently outperform frame difference features on both architectures. Second, fusing the two features improves on either feature alone. For example, on the three-class CASME II task, MPFNet-C achieves an accuracy of 0.865 with optical flow alone and 0.662 with frame difference alone, whereas combining the two raises the accuracy to 0.924, a gain of 0.059 over the stronger single feature. These results confirm the dominant role of optical flow features in MER and show that adding frame difference features further strengthens the visual representation.</p>
        </para>
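        <para>
          <p>As an illustration only, the frame difference cue discussed above can be sketched in a few lines of Python. The sketch assumes grayscale frames stored as a NumPy array of shape (L, H, W) and uses the absolute difference between consecutive frames; both the function name and the use of the absolute value are assumptions rather than the exact procedure used in this work. Optical flow is omitted because it requires a dedicated flow estimator.</p>
          <verbatim font="typewriter"><![CDATA[
```python
import numpy as np


def frame_differences(frames: np.ndarray) -> np.ndarray:
    """Absolute intensity change between consecutive frames.

    frames: array of shape (L, H, W).
    Returns an array of shape (L - 1, H, W).
    """
    # Cast first so that uint8 inputs do not wrap around on subtraction.
    frames = frames.astype(np.float32)
    return np.abs(frames[1:] - frames[:-1])


# Toy clip (hypothetical data): one pixel switches on at frame 5.
clip = np.zeros((11, 4, 4), dtype=np.uint8)
clip[5:, 1, 1] = 200
diffs = frame_differences(clip)
```
]]></verbatim>
          <p>In this toy clip, only the difference map between frames 4 and 5 is non-zero, which mirrors how frame differences respond to brief, localized intensity changes.</p>
        </para>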
      </subsubsection>
    </subsection>
    <subsection inlist="toc" xml:id="S5.SS4">
      <tags>
        <tag>5.4</tag>
        <tag role="autoref">subsection 5.4</tag>
        <tag role="refnum">5.4</tag>
        <tag role="typerefnum">§5.4</tag>
      </tags>
      <title><tag close=" ">5.4</tag><text font="italic">Hyperparameter settings</text></title>
      <para xml:id="S5.SS4.p1">
        <p>In this study, we introduce two critical hyperparameters: the length of the ME frame sequence (<Math mode="inline" tex="L" text="L" xml:id="S5.SS4.p1.m1">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math>) after interpolation and the distance-weighting factor (<Math mode="inline" tex="\gamma" text="gamma" xml:id="S5.SS4.p1.m2">
            <XMath>
              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
            </XMath>
          </Math>) in MPFNet-P. To identify their optimal values, we conducted a series of experiments, varying <Math mode="inline" tex="L" text="L" xml:id="S5.SS4.p1.m3">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math> within the range <Math mode="inline" tex="\{3,4,\dots,20\}" text="set@(3, 4, dots, 20)" xml:id="S5.SS4.p1.m4">
            <XMath>
              <XMDual>
                <XMApp>
                  <XMTok meaning="set"/>
                  <XMRef idref="S5.SS4.p1.m4.1"/>
                  <XMRef idref="S5.SS4.p1.m4.2"/>
                  <XMRef idref="S5.SS4.p1.m4.3"/>
                  <XMRef idref="S5.SS4.p1.m4.4"/>
                </XMApp>
                <XMWrap>
                  <XMTok role="OPEN" stretchy="false">{</XMTok>
                  <XMTok meaning="3" role="NUMBER" xml:id="S5.SS4.p1.m4.1">3</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok meaning="4" role="NUMBER" xml:id="S5.SS4.p1.m4.2">4</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok name="dots" role="ID" xml:id="S5.SS4.p1.m4.3">…</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok meaning="20" role="NUMBER" xml:id="S5.SS4.p1.m4.4">20</XMTok>
                  <XMTok role="CLOSE" stretchy="false">}</XMTok>
                </XMWrap>
              </XMDual>
            </XMath>
          </Math> and <Math mode="inline" tex="\gamma" text="gamma" xml:id="S5.SS4.p1.m5">
            <XMath>
              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
            </XMath>
          </Math> within <Math mode="inline" tex="\{0.0,0.1,\dots,1.0\}" text="set@(0.0, 0.1, dots, 1.0)" xml:id="S5.SS4.p1.m6">
            <XMath>
              <XMDual>
                <XMApp>
                  <XMTok meaning="set"/>
                  <XMRef idref="S5.SS4.p1.m6.1"/>
                  <XMRef idref="S5.SS4.p1.m6.2"/>
                  <XMRef idref="S5.SS4.p1.m6.3"/>
                  <XMRef idref="S5.SS4.p1.m6.4"/>
                </XMApp>
                <XMWrap>
                  <XMTok role="OPEN" stretchy="false">{</XMTok>
                  <XMTok meaning="0.0" role="NUMBER" xml:id="S5.SS4.p1.m6.1">0.0</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok meaning="0.1" role="NUMBER" xml:id="S5.SS4.p1.m6.2">0.1</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok name="dots" role="ID" xml:id="S5.SS4.p1.m6.3">…</XMTok>
                  <XMTok role="PUNCT">,</XMTok>
                  <XMTok meaning="1.0" role="NUMBER" xml:id="S5.SS4.p1.m6.4">1.0</XMTok>
                  <XMTok role="CLOSE" stretchy="false">}</XMTok>
                </XMWrap>
              </XMDual>
            </XMath>
          </Math>. The results, presented in Fig. <ref labelref="LABEL:figs:hyperparameter"/>, demonstrate that the three-class classification accuracy peaks and stabilizes when <Math mode="inline" tex="L" text="L" xml:id="S5.SS4.p1.m7">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math> ranges between 11 and 13 across all datasets. However, further increasing <Math mode="inline" tex="L" text="L" xml:id="S5.SS4.p1.m8">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math> results in accuracy fluctuations or declines, likely due to information redundancy, noise accumulation, and heightened computational complexity, which collectively degrade classification performance. To achieve a balance between model performance and computational efficiency, we set <Math mode="inline" tex="L=11" text="L = 11" xml:id="S5.SS4.p1.m9">
            <XMath>
              <XMApp>
                <XMTok meaning="equals" role="RELOP">=</XMTok>
                <XMTok font="italic" role="UNKNOWN">L</XMTok>
                <XMTok meaning="11" role="NUMBER">11</XMTok>
              </XMApp>
            </XMath>
          </Math>. As for <Math mode="inline" tex="\gamma" text="gamma" xml:id="S5.SS4.p1.m10">
            <XMath>
              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
            </XMath>
          </Math>, the optimal values for MPFNet-P are 0.8, 0.7, 0.6, and 0.7 on the SMIC, CASME II, SAMM, and MEGC2019-CD datasets, respectively. All four values exceed 0.5, which suggests that the AFE encoder plays the predominant role in feature representation learning within the embedding space.</p>
      </para>
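      <para>
        <p>The search described above can be sketched as follows. This is a minimal illustration under a stated assumption: the weighting factor is taken to form a convex combination of the distance produced by the AFE encoder and the distance produced by the other encoder. The exact fusion rule is the one given in the method section, and the names fused_distance, grid_search, d_afe, and d_other are hypothetical.</p>
        <verbatim font="typewriter"><![CDATA[
```python
from typing import Callable, Sequence, Tuple


def fused_distance(d_afe: float, d_other: float, gamma: float) -> float:
    """Assumed fusion rule: convex combination of two encoder distances."""
    return gamma * d_afe + (1.0 - gamma) * d_other


def grid_search(values: Sequence[float],
                score: Callable[[float], float]) -> Tuple[float, float]:
    """Return (best_value, best_score) over a one-dimensional grid.

    `score` is any callable that trains and evaluates the model for one
    hyperparameter value and returns its validation accuracy.
    """
    best = max(values, key=score)
    return best, score(best)


# Search grids as stated in the text.
L_GRID = list(range(3, 21))                           # {3, 4, ..., 20}
GAMMA_GRID = [round(0.1 * k, 1) for k in range(11)]   # {0.0, 0.1, ..., 1.0}
```
]]></verbatim>
        <p>With gamma equal to 1.0 the fused distance reduces to the AFE distance alone, and with gamma equal to 0.0 it reduces to the other encoder's distance, so values above 0.5 weight the AFE encoder more heavily under this assumed rule.</p>
      </para>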
      <figure inlist="lof" labels="LABEL:figs:hyperparameter" placement="t" xml:id="S5.F8">
        <tags>
          <tag>Fig. 8</tag>
          <tag role="autoref">Figure 8</tag>
          <tag role="refnum">8</tag>
          <tag role="typerefnum">Fig. 8</tag>
        </tags>
        <graphics candidates="hyperparameter.pdf" class="ltx_centering" graphic="hyperparameter.pdf" options="width=368.577pt" xml:id="S5.F8.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">8</tag>Impact of hyperparameters on MER: Frame sequence length <Math mode="inline" tex="L" text="L" xml:id="S5.F8.m1">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math> (Top) and weighting factor <Math mode="inline" tex="\gamma" text="gamma" xml:id="S5.F8.m2">
            <XMath>
              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
            </XMath>
          </Math> (Bottom).</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 8</tag>Impact of hyperparameters on MER: Frame sequence length <Math mode="inline" tex="L" text="L" xml:id="S5.F8.m3">
            <XMath>
              <XMTok font="italic" role="UNKNOWN">L</XMTok>
            </XMath>
          </Math> (Top) and weighting factor <Math mode="inline" tex="\gamma" text="gamma" xml:id="S5.F8.m4">
            <XMath>
              <XMTok font="italic" name="gamma" role="UNKNOWN">γ</XMTok>
            </XMath>
          </Math> (Bottom).</caption>
      </figure>
    </subsection>
    <subsection inlist="toc" labels="LABEL:visualization_and_analysis" xml:id="S5.SS5">
      <tags>
        <tag>5.5</tag>
        <tag role="autoref">subsection 5.5</tag>
        <tag role="refnum">5.5</tag>
        <tag role="typerefnum">§5.5</tag>
      </tags>
      <title><tag close=" ">5.5</tag><text font="italic">Visual analysis</text></title>
      <para xml:id="S5.SS5.p1">
        <p><text font="bold">Visualization of confusion matrices.</text> To gain further insight into the proposed method, we visualize the confusion matrices for the different prior learning strategies on the four datasets, as shown in Fig. <ref labelref="LABEL:figs:confusion_matrices"/>. The diagonal elements give the proportion of correctly classified MEs in each class of the test set, with darker colors indicating higher accuracy. When trained from scratch without prior knowledge, our model exhibits poor classification accuracy that varies considerably across categories. As prior knowledge is gradually introduced, MPFNet’s recognition of positive, negative, and surprise expressions improves substantially. Notably, the negative class contains the most samples across the three datasets, particularly in the CASME II and SAMM datasets. Many existing algorithms achieve high classification accuracy for this dominant category at the expense of the other two categories. MPFNet significantly improves the accuracy of the two minority categories and achieves a more balanced accuracy distribution across all categories.
For instance, without prior knowledge, the standard deviations of the per-class accuracies on the SMIC, CASME II, SAMM, and MEGC2019-CD datasets are 0.079, 0.073, 0.101, and 0.090, respectively. With multi-prior fusion in MPFNet-C, these standard deviations fall to 0.022, 0.044, 0.035, and 0.035. These results indicate that the multi-prior learning strategy designed in this study mitigates the impact of data scarcity and class imbalance on MER accuracy.</p>
      </para>
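      <para>
        <p>The balance measure used above, the standard deviation of the per-class accuracies, can be computed from a confusion matrix as in the following sketch. It assumes that rows index the true class and that the population standard deviation is used; neither convention is confirmed here, and the matrix shown is a toy example, not a result from this paper.</p>
        <verbatim font="typewriter"><![CDATA[
```python
import numpy as np


def per_class_accuracy(cm: np.ndarray) -> np.ndarray:
    """Row-normalised diagonal: fraction of each true class classified correctly."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)


def accuracy_spread(cm: np.ndarray) -> float:
    """Population standard deviation of the per-class accuracies."""
    return float(np.std(per_class_accuracy(cm)))


# Toy confusion matrix (rows: true N, P, S; columns: predicted N, P, S).
toy_cm = np.array([[8, 1, 1],
                   [2, 6, 2],
                   [1, 1, 8]])
class_acc = per_class_accuracy(toy_cm)
spread = accuracy_spread(toy_cm)
```
]]></verbatim>
        <p>A smaller spread means that the classifier treats the categories more evenly, which is the sense in which the standard deviations above are compared.</p>
      </para>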
      <figure inlist="lof" labels="LABEL:figs:confusion_matrices" placement="t" xml:id="S5.F9">
        <tags>
          <tag>Fig. 9</tag>
          <tag role="autoref">Figure 9</tag>
          <tag role="refnum">9</tag>
          <tag role="typerefnum">Fig. 9</tag>
        </tags>
        <graphics candidates="Confusion_matrices.pdf" class="ltx_centering" graphic="Confusion_matrices.pdf" options="width=303.534pt" xml:id="S5.F9.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">9</tag>Confusion matrices for MER with different prior learning strategies on the SMIC, CASME II, SAMM, and MEGC2019-CD datasets. The terms w/o PL, PLTN, PLSM, and MPFNet refer to four distinct prior learning strategies. N, P, and S stand for negative, positive, and surprise, respectively.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 9</tag>Confusion matrices for MER with different prior learning strategies on the SMIC, CASME II, SAMM, and MEGC2019-CD datasets. The terms w/o PL, PLTN, PLSM, and MPFNet refer to four distinct prior learning strategies. N, P, and S stand for negative, positive, and surprise, respectively.</caption>
      </figure>
      <para xml:id="S5.SS5.p2">
        <p><text font="bold">Visualization of feature distribution.</text> We use t-SNE to project the deep features of the model into a two-dimensional space and visualize them as a scatter plot. As shown in Fig. <ref labelref="LABEL:figs:tSNE"/>, the features extracted by the model without prior knowledge overlap heavily, with samples from all three categories blending together. When PLTN and PLSM are applied, the boundaries between categories become progressively wider and more distinct. After incorporating both types of prior knowledge, MPFNet-C learns more compact intra-class features, and the negative, positive, and surprise samples form tighter clusters with clearer boundaries, making them easier to separate. This indicates that our model extracts more discriminative features, which enhances its MER capability.</p>
      </para>
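      <para>
        <p>A projection of this kind can be sketched with the TSNE class from scikit-learn, as shown next. The feature matrix is synthetic and merely stands in for encoder outputs, and the helper name project_2d, the perplexity, and the seed are illustrative assumptions rather than the settings used in this work.</p>
        <verbatim font="typewriter"><![CDATA[
```python
import numpy as np
from sklearn.manifold import TSNE


def project_2d(features: np.ndarray, perplexity: float = 5.0,
               seed: int = 0) -> np.ndarray:
    """Project (n_samples, n_dims) features to (n_samples, 2) with t-SNE."""
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=seed)
    return tsne.fit_transform(features)


# Synthetic stand-in for deep features: three classes, ten samples each.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 8, [4.0] * 8, [-4.0] * 8])
features = np.vstack([c + rng.normal(size=(10, 8)) for c in centers])
labels = np.repeat([0, 1, 2], 10)

embedding = project_2d(features)  # one 2-D point per sample
```
]]></verbatim>
        <p>The rows of embedding can then be drawn as a scatter plot colored by labels. The perplexity must stay below the number of samples, which matters for small ME datasets.</p>
      </para>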
      <figure inlist="lof" labels="LABEL:figs:tSNE" placement="t" xml:id="S5.F10">
        <tags>
          <tag>Fig. 10</tag>
          <tag role="autoref">Figure 10</tag>
          <tag role="refnum">10</tag>
          <tag role="typerefnum">Fig. 10</tag>
        </tags>
        <graphics candidates="tSNE.pdf" class="ltx_centering" graphic="tSNE.pdf" options="width=320.8788pt" xml:id="S5.F10.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">10</tag>The t-SNE algorithm is utilized for visualizing deep features in the three-class classification task of negative, positive, and surprise expressions. The terms w/o PL, PLTN, and PLSM represent different prior learning strategies. As prior knowledge is progressively integrated, the boundaries between categories become increasingly distinct.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 10</tag>The t-SNE algorithm is utilized for visualizing deep features in the three-class classification task of negative, positive, and surprise expressions. The terms w/o PL, PLTN, and PLSM represent different prior learning strategies. As prior knowledge is progressively integrated, the boundaries between categories become increasingly distinct.</caption>
      </figure>
      <para xml:id="S5.SS5.p3">
        <p><text font="bold">Visualization of feature heatmaps.</text> To gain a deeper understanding of the learned features, we visualize the activation heatmaps using Grad-CAM<note mark="3" role="footnote" xml:id="footnote3"><tags>
              <tag>3</tag>
              <tag role="autoref">footnote 3</tag>
              <tag role="refnum">3</tag>
              <tag role="typerefnum">footnote 3</tag>
            </tags><ref class="ltx_url" font="typewriter" href="https://github.com/jacobgil/pytorch-grad-cam">https://github.com/jacobgil/pytorch-grad-cam</ref></note>, as shown in Fig. <ref labelref="LABEL:figs:grad_cam"/>. This visualization shows which facial regions the model relies on. Grad-CAM generates localization maps that highlight the regions activated during facial feature extraction. We select one sample from each of the five emotional categories, apply Grad-CAM after the final convolutional layer of the model, and superimpose the resulting heatmap onto the original sample image. Without prior learning, the model focuses on regions unrelated to MEs, which harms its performance. After incorporating the multi-prior learning strategy, the highlighted regions gradually converge on key facial areas, such as the eyebrows and the corners of the mouth, that are critical for detecting subtle MEs. Specifically, for the “happiness” sample, the Grad-CAM heatmap highlights the zygomaticus major muscle, corresponding to AU12 (“Lip corner puller”). For the “surprise” sample, the highlighted regions include the frontalis (pars lateralis) and masseter muscles, corresponding to AU2 (“Outer brow raiser”) and AU26 (“Jaw drop”), respectively. For the “anger” sample, the heatmap highlights the corrugator supercilii and orbicularis oculi muscles, consistent with AU4 (“Brow lowerer”) and AU7 (“Lid tightener”), respectively. For the “sadness” sample, the highlighted region corresponds to the frontalis (pars medialis), associated with AU1 (“Inner brow raiser”). For the “contempt” sample, the zygomaticus major and zygomaticus minor muscles are highlighted, corresponding to AU12 (“Lip corner puller”) and AU14 (“Dimpler”), respectively. These heatmaps indicate that the model attends to the facial regions associated with each emotion.
It should be noted that MPFNet-P employs a parallel fusion architecture for its feature encoders, which makes t-SNE and Grad-CAM inapplicable to this model. Consequently, MPFNet-P is not included in Figs. <ref labelref="LABEL:figs:tSNE"/> and <ref labelref="LABEL:figs:grad_cam"/>.</p>
      </para>
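      <para>
        <p>The core Grad-CAM computation can be sketched in NumPy as follows. The sketch assumes that the activations of the chosen layer and the gradients of the class score with respect to them have already been extracted, and it uses two spatial dimensions for brevity, whereas a 3D backbone would add a temporal axis. The function names grad_cam and overlay are hypothetical, and this is not the pytorch-grad-cam implementation referenced above.</p>
        <verbatim font="typewriter"><![CDATA[
```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM map from one layer's activations and class-score gradients.

    Both inputs have shape (K, H, W); the result has shape (H, W) in [0, 1].
    """
    # Channel weights: global average of the gradients over space.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the activation maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only regions that support the target class.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def overlay(image: np.ndarray, cam: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a grayscale image in [0, 1] with a heatmap of the same size."""
    return (1.0 - alpha) * image + alpha * cam
```
]]></verbatim>
        <p>In practice the map is upsampled to the input resolution before blending. That step is omitted here.</p>
      </para>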
      <figure inlist="lof" labels="LABEL:figs:grad_cam" placement="t" xml:id="S5.F11">
        <tags>
          <tag>Fig. 11</tag>
          <tag role="autoref">Figure 11</tag>
          <tag role="refnum">11</tag>
          <tag role="typerefnum">Fig. 11</tag>
        </tags>
        <graphics candidates="Grad_cam.pdf" class="ltx_centering" graphic="Grad_cam.pdf" options="width=390.258pt" xml:id="S5.F11.g1"/>
        <toccaption class="ltx_centering"><tag close=" ">11</tag>Visual explanations of ME via gradient-based localization. We select a sample from each of the five emotional categories, apply the Grad-CAM method after the last convolutional layer of the model, and then superimpose the generated heatmap on the original sample image.</toccaption>
        <caption class="ltx_centering"><tag close=": ">Fig. 11</tag>Visual explanations of ME via gradient-based localization. We select a sample from each of the five emotional categories, apply the Grad-CAM method after the last convolutional layer of the model, and then superimpose the generated heatmap on the original sample image.</caption>
      </figure>
    </subsection>
  </section>
  <section inlist="toc" labels="LABEL:conclusion" xml:id="S6">
    <tags>
      <tag>6</tag>
      <tag role="autoref">section 6</tag>
      <tag role="refnum">6</tag>
      <tag role="typerefnum">§6</tag>
    </tags>
    <title><tag close=" ">6</tag><text font="smallcaps">Conclusion</text></title>
    <para xml:id="S6.p1">
      <p>This paper proposes a multi-prior fusion network (MPFNet) that makes effective use of scarce ME data and addresses class imbalance. First, we design a prior learning strategy based on a triplet network to train the model to encode general ME features. To overcome the limitations of ME samples in transfer learning, we construct a sample-balanced and motion-amplified ME dataset to further train the model and extract more advanced ME features. Both feature encoders adopt the CA-I3D model as the backbone, enabling efficient learning of crucial spatiotemporal and channel features. Furthermore, we design two model variants, MPFNet-P and MPFNet-C, to evaluate the impact of different prior knowledge integration strategies on MER. Experimental results demonstrate that the proposed method not only improves overall MER accuracy but also yields balanced performance across categories. Future research will focus on multimodal ME datasets, such as CAS(ME)<Math mode="inline" tex="{}^{3}" text="^3" xml:id="S6.p1.m1">
          <XMath>
            <XMApp role="FLOATSUPERSCRIPT" scriptpos="1">
              <XMTok fontsize="70%" meaning="3" role="NUMBER">3</XMTok>
            </XMApp>
          </XMath>
        </Math> <cite class="ltx_citemacro_cite">[<bibref bibrefs="li2022cas" separator="," yyseparator=","/>]</cite>, to further explore multimodal ME features and develop more efficient fusion strategies. The ultimate goal is to achieve a more robust and generalized MER framework.
</p>
    </para>
  </section>
  <section inlist="toc" labels="LABEL:Ethical" xml:id="S7">
    <tags>
      <tag>7</tag>
      <tag role="autoref">section 7</tag>
      <tag role="refnum">7</tag>
      <tag role="typerefnum">§7</tag>
    </tags>
    <title><tag close=" ">7</tag><text font="smallcaps">Ethical impact statement</text></title>
    <para xml:id="S7.p1">
      <p>Privacy and data protection are paramount in ME research. ME data may contain sensitive biometric information, and deep learning models could potentially identify specific patterns from such data. Therefore, it is crucial to safeguard both the original data and the learned patterns. This requires secure model storage and robust privacy-preserving techniques to prevent leakage of sensitive information. The public ME dataset used in this study was collected with informed consent from participants, covering data collection, processing, and sharing. Additionally, the optical flow and frame-difference extraction methods applied in this study remove sensitive information, such as appearance and gender, while preserving the facial motion characteristics essential for ME analysis. This approach supports the ethical development and deployment of MER systems.
</p>
    </para>
<!--  %Furthermore,␣the␣experimental␣protocol␣adhered␣strictly␣to␣the␣guidelines␣and␣regulations␣approved␣by␣the␣Ethics␣Committee␣of␣Tianjin␣University␣(TJUE-2021-138). 
     %****␣manuscript.tex␣Line␣1250␣****-->  </section>
  <section xml:id="Sx1">
    <title>Acknowledgments</title>
    <para xml:id="Sx1.p1">
      <p>This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 62332019, 62076250, and 62406338) and the National Key Research and Development Program of China (Grant Nos. 2023YFF1203900 and 2023YFF1203903).</p>
    </para>
<!--  %This␣work␣was␣supported␣in␣part␣by␣the␣grants␣from␣the␣National␣Natural␣Science␣Foundation␣of␣China␣under␣Grant␣(No.\py{62204204,␣No.62406338},␣No.62332019,␣and␣No.62076250),␣the␣National␣Key␣Research␣and␣Development␣Program␣of␣China␣(No.2023YFF1203900␣and␣No.2023YFF1203903). 
     %\py{This␣work␣was␣also␣supported␣by␣the␣Science␣and␣Technology␣Innovation␣2030-Major␣Project␣(No.2022ZD0208601␣and␣No.2022ZD0208600).}
     %\section*{Statement␣on␣Generative␣AI␣Technology}
     %The␣manuscript␣has␣undergone␣appropriate␣language␣refinement␣using␣an␣AI␣assistant␣(https://poe.com).
     %\section*{Data␣availability␣statement}
     %The␣data␣and␣codes␣used␣in␣this␣study␣can␣be␣obtained␣on␣the␣request␣from␣the␣corresponding␣author.-->  </section>
  <bibliography bibstyle="IEEEtran" citestyle="numbers" files="ref_supp" xml:id="bib">
    <title>References</title>
  </bibliography>
<!--  %that’s␣all␣folks --></document>
