Search | arXiv e-print repository

Approximate Spectral Decomposition of Fisher Information Matrix for Simple ReLU Networks

Authors: Yoshinari Takeishi, Masazumi Iida, Jun'ichi Takeuchi

Abstract: We discuss the Fisher information matrix (FIM) of one-hidden-layer networks with the ReLU activation function. For a network, let $W$ denote the $d \times p$ weight matrix from the $d$-dimensional input to the hidden layer consisting of $p$ neurons, and $v$ the $p$-dimensional weight vector from the hidden layer to the scalar output. We focus on the FIM of $v$, which we denote as $I$. Under certain conditions, we characterize the first three clusters of eigenvalues and eigenvectors of the FIM. Specifically, we show that 1) Since $I$ is non-negative owing to the ReLU, the first eigenvalue is the Perron-Frobenius eigenvalue. 2) For the cluster of the next maximum values, the eigenspace is spanned by the row vectors of $W$. 3) The direct sum of the eigenspace of the first eigenvalue and that of the third cluster is spanned by the set of all the vectors obtained as the Hadamard product of any pair of the row vectors of $W$. We confirmed by numerical calculation that the above is approximately correct when the number of hidden nodes is about 10000.

Submitted 27 February, 2023; v1 submitted 30 November, 2021; originally announced November 2021.

arXiv:2106.08892 [pdf, ps, other]

Development of Quantized DNN Library for Exact Hardware Emulation

Authors: Masato Kiyama, Motoki Amagasaki, Masahiro Iida

Abstract: Quantization is used to speed up execution time and save power when running deep neural networks (DNNs) on edge devices like AI chips. To investigate the effect of quantization, we need to perform inference after quantizing the 32-bit floating-point weights of a DNN to some bit width and then quantizing them back to 32-bit floating-point precision. This is because the DNN library can only handle floating-point numbers. However, the accuracy of the emulation does not provide accurate precision. We need accurate precision to detect overflow in MAC operations or to verify the operation on edge devices. We have developed PyParch, a DNN library that executes quantized DNNs (QNNs) with exactly the same behavior as hardware. In this paper, we describe a new proposal and implementation of PyParch. As a result of the evaluation, the accuracy of QNNs with arbitrary bit widths can be estimated for large and complex DNNs such as YOLOv5, and the overflow can be detected. We evaluated the overhead of the emulation time and found that it was 5.6 times slower for QNN and 42 times slower for QNN with overflow detection compared to the normal DNN execution time.

Submitted 15 June, 2021; originally announced June 2021.

arXiv:2105.08253 [pdf, other]

Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K Videos using a Joint Detection-and-Tracking Approach

Authors: Ryota Yoshihashi, Rei Kawakami, Shaodi You, Tu Tuan Trinh, Makoto Iida, Takeshi Naemura

Abstract: Detecting tiny objects in a high-resolution video is challenging because the visual information is little and unreliable. Specifically, the challenge includes very low resolution of the objects, MPEG artifacts due to compression and a large searching area with many hard negatives. Tracking is equally difficult because of the unreliable appearance, and the unreliable motion estimation. Luckily, we found that by combining these two challenging tasks together, there will be mutual benefits. Following the idea, in this paper, we present a neural network model called the Recurrent Correlational Network, where detection and tracking are jointly performed over a multi-frame representation learned through a single, trainable, and end-to-end network. The framework exploits a convolutional long short-term memory network for learning informative appearance changes for detection, while the learned representation is shared in tracking for enhancing its performance. In experiments with datasets containing images of scenes with small flying objects, such as birds and unmanned aerial vehicles, the proposed method yielded consistent improvements in detection performance over deep single-frame detectors and existing motion-based detectors. Furthermore, our network performs as well as state-of-the-art generic object trackers when it was evaluated as a tracker on a bird image dataset.

Submitted 17 May, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:1709.04666

arXiv:1910.10431 [pdf, ps, other]

doi 10.1134/S0040577919090010

Statistical nature of Skyrme-Faddeev models in $2+1$ dimensions and normalizable fermions

Authors: Yuki Amari, Masaya Iida, Nobuyuki Sawado

Abstract: The Skyrme-Faddeev model has planar soliton solutions with target space $\mathbb{C}P^N$. An Abelian Chern-Simons term (the Hopf term) in the Lagrangian of the model plays a crucial role for the statistical properties of the solutions. Because $Πぱい_3(\mathbb{C}P^1)=\mathbb{Z}$, the term becomes an integer for $N=1$. On the other hand, for $N>1$, it becomes perturbative because $Πぱい_3(\mathbb{C}P^N)$ is trivial. The prefactor $Θしーた$ of the Hopf term is not quantized, and its value depends on the physical system. We study the spectral flow of the normalizable fermions coupled with the baby-Skyrme model ($\mathbb{C}P^N$ Skyrme-Faddeev model). We discuss whether the statistical nature of solitons can be explained using their constituents, i.e., the quarks.

Submitted 23 October, 2019; originally announced October 2019.

Comments: 18 pages, 14 figures

Journal ref: Theoretical and Mathematical Physics, 200(3): 1253-1268 (2019)

arXiv:1908.05141 [pdf, other]

J-PARC Neutrino Beamline Upgrade Technical Design Report

Authors: K. Abe, H. Aihara, A. Ajmi, C. Alt, C. Andreopoulos, M. Antonova, S. Aoki, Y. Asada, Y. Ashida, A. Atherton, E. Atkin, S. Ban, F. C. T. Barbato, M. Barbi, G. J. Barker, G. Barr, M. Batkiewicz, A. Beloshapkin, V. Berardi, L. Berns, S. Bhadra, J. Bian, S. Bienstock, A. Blondel, S. Bolognesi , et al. (360 additional authors not shown)

Abstract: In this document, technical details of the upgrade plan of the J-PARC neutrino beamline for the extension of the T2K experiment are described. T2K has proposed to accumulate data corresponding to $2\times{}10^{22}$ protons-on-target in the next decade, aiming at an initial observation of CP violation with $3σしぐま$ or higher significance in the case of maximal CP violation. Methods to increase the neutrino beam intensity, which are necessary to achieve the proposed data increase, are described.

Submitted 14 August, 2019; originally announced August 2019.

arXiv:1812.04246 [pdf, other]

Classification-Reconstruction Learning for Open-Set Recognition

Authors: Ryota Yoshihashi, Wen Shao, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

Abstract: Open-set classification is a problem of handling 'unknown' classes that are not contained in the training dataset, whereas traditional classifiers assume that only known classes appear in the test environment. Existing open-set classifiers rely on deep networks trained in a supervised manner on known classes in the training set; this causes specialization of learned representations to known classes and makes it hard to distinguish unknowns from knowns. In contrast, we train networks for joint classification and reconstruction of input data. This enhances the learned representation so as to preserve information useful for separating unknowns from knowns, as well as to discriminate classes of knowns. Our novel Classification-Reconstruction learning for Open-Set Recognition (CROSR) utilizes latent representations for reconstruction and enables robust unknown detection without harming the known-class classification accuracy. Extensive experiments reveal that the proposed method outperforms existing deep open-set classifiers in multiple standard datasets and is robust to diverse outliers. The code is available at https://nae-lab.org/~rei/research/crosr/.

Submitted 6 October, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

Comments: 11 pages, 7 figures

arXiv:1805.09497 [pdf]

doi 10.1016/j.ecoenv.2018.05.065

A network biology-based approach to evaluating the effect of environmental contaminants on human interactome and diseases

Authors: Midori Iida, Kazuhiro Takemoto

Abstract: Environmental contaminant exposure can pose significant risks to human health. Therefore, evaluating the impact of this exposure is of great importance; however, it is often difficult because both the molecular mechanism of disease and the mode of action of the contaminants are complex. We used network biology techniques to quantitatively assess the impact of environmental contaminants on the human interactome and diseases with a particular focus on seven major contaminant categories: persistent organic pollutants (POPs), dioxins, polycyclic aromatic hydrocarbons (PAHs), pesticides, perfluorochemicals (PFCs), metals, and pharmaceutical and personal care products (PPCPs). We integrated publicly available data on toxicogenomics, the diseasome, protein-protein interactions (PPIs), and gene essentiality and found that a few contaminants were targeted to many genes, and a few genes were targeted by many contaminants. The contaminant targets were hub proteins in the human PPI network, whereas the target proteins in most categories did not contain abundant essential proteins. Generally, contaminant targets and disease-associated proteins were closely associated with the PPI network, and the closeness of the associations depended on the disease type and chemical category. Network biology techniques were used to identify environmental contaminants with broad effects on the human interactome and contaminant-sensitive biomarkers. Moreover, this method enabled us to quantify the relationship between environmental contaminants and human diseases, which was supported by epidemiological and experimental evidence. These methods and findings have facilitated the elucidation of the complex relationship between environmental exposure and adverse health outcomes.

Submitted 23 May, 2018; originally announced May 2018.

Comments: 35 pages, 12 figures

Journal ref: Ecotoxicology and Environmental Safety 160, 316-327 (2018)

arXiv:1805.05569 [pdf, other]

Cross-connected Networks for Multi-task Learning of Detection and Segmentation

Authors: Seiichiro Fukuda, Ryota Yoshihashi, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

Abstract: Multi-task learning improves generalization performance by sharing knowledge among related tasks. Existing models are for task combinations annotated on the same dataset, while there are cases where multiple datasets are available for each task. How to utilize knowledge of successful single-task CNNs that are trained on each dataset has been explored less than multi-task learning with a single dataset. We propose a cross-connected CNN, a new architecture that connects single-task CNNs through convolutional layers, which transfer useful information for the counterpart. We evaluated our proposed architecture on a combination of detection and segmentation using two datasets. Experiments on pedestrians show our CNN achieved a higher detection performance compared to baseline CNNs, while maintaining high quality for segmentation. It is the first known attempt to tackle multi-task learning with different training datasets between detection and segmentation. Experiments with wild birds demonstrate how our CNN learns general representations from limited datasets.

Submitted 15 May, 2018; originally announced May 2018.

arXiv:1709.04666 [pdf, other]

Differentiating Objects by Motion: Joint Detection and Tracking of Small Flying Objects

Authors: Ryota Yoshihashi, Tu Tuan Trinh, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

Abstract: While generic object detection has achieved large improvements with rich feature hierarchies from deep nets, detecting small objects with poor visual cues remains challenging. Motion cues from multiple frames may be more informative for detecting such hard-to-distinguish objects in each frame. However, how to encode discriminative motion patterns, such as deformations and pose changes that characterize objects, has remained an open question. To learn them and thereby realize small object detection, we present a neural model called the Recurrent Correlational Network, where detection and tracking are jointly performed over a multi-frame representation learned through a single, trainable, and end-to-end network. A convolutional long short-term memory network is utilized for learning informative appearance change for detection, while learned representation is shared in tracking for enhancing its performance. In experiments with datasets containing images of scenes with small flying objects, such as birds and unmanned aerial vehicles, the proposed method yielded consistent improvements in detection performance over deep single-frame detectors and existing motion-based detectors. Furthermore, our network performs as well as state-of-the-art generic object trackers when it was evaluated as a tracker on the bird dataset.

Submitted 15 May, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

Comments: 10 pages, 8 figures

arXiv:1609.03136 [pdf, other]

doi 10.1109/NOCS.2016.7579334

A Heuristic Method of Generating Diameter 3 Graphs for Order/Degree Problem

Authors: Teruaki Kitasuka, Masahiro Iida

Abstract: We propose a heuristic method that generates a graph for the order/degree problem. Target graphs of our heuristics have large order (> 4000) and diameter 3. We describe the observation of smaller graphs and the basic structure of our heuristics. We also explain an evaluation function of each edge for efficient 2-opt local search. Using them, we found the best solutions for several graphs.

Submitted 11 September, 2016; originally announced September 2016.

Comments: Proceedings of 10th IEEE/ACM International Symposium on Networks-on-Chip, Nara, Japan, Aug. 2016

arXiv:1201.1386 [pdf, other]

doi 10.1103/PhysRevD.85.031103

First Muon-Neutrino Disappearance Study with an Off-Axis Beam

Authors: T2K Collaboration, K. Abe, N. Abgrall, Y. Ajima, H. Aihara, J. B. Albert, C. Andreopoulos, B. Andrieu, M. D. Anerella, S. Aoki, O. Araoka, J. Argyriades, A. Ariga, T. Ariga, S. Assylbekov, D. Autiero, A. Badertscher, M. Barbi, G. J. Barker, G. Barr, M. Bass, M. Batkiewicz, F. Bay, S. Bentham, V. Berardi , et al. (422 additional authors not shown)

Abstract: We report a measurement of muon-neutrino disappearance in the T2K experiment. The 295-km muon-neutrino beam from Tokai to Kamioka is the first implementation of the off-axis technique in a long-baseline neutrino oscillation experiment. With data corresponding to $1.43 \times 10^{20}$ protons on target, we observe 31 fully-contained single muon-like ring events in Super-Kamiokande, compared with an expectation of $104 \pm 14$ (syst) events without neutrino oscillations. The best-fit point for two-flavor $\nu_\mu \rightarrow \nu_\tau$ oscillations is $\sin^2(2\theta_{23}) = 0.98$ and $|\Delta m^2_{32}| = 2.65 \times 10^{-3}$ eV$^2$. The boundary of the 90% confidence region includes the points $(\sin^2(2\theta_{23}), |\Delta m^2_{32}|) = (1.0,\ 3.1 \times 10^{-3}\ \mathrm{eV}^2)$, $(0.84,\ 2.65 \times 10^{-3}\ \mathrm{eV}^2)$ and $(1.0,\ 2.2 \times 10^{-3}\ \mathrm{eV}^2)$.

Submitted 6 January, 2012; originally announced January 2012.

Comments: 7 pages, 4 figures

Journal ref: Physical Review D 85, 031103(R) (2012)

arXiv:1111.3119 [pdf, other]

doi 10.1016/j.nima.2012.03.023

Measurements of the T2K neutrino beam properties using the INGRID on-axis near detector

Authors: K. Abe, N. Abgrall, Y. Ajima, H. Aihara, J. B. Albert, C. Andreopoulos, B. Andrieu, M. D. Anerella, S. Aoki, O. Araoka, J. Argyriades, A. Ariga, T. Ariga, S. Assylbekov, D. Autiero, A. Badertscher, M. Barbi, G. J. Barker, G. Barr, M. Bass, M. Batkiewicz, F. Bay, S. Bentham, V. Berardi, B. E. Berger , et al. (407 additional authors not shown)

Abstract: Precise measurement of neutrino beam direction and intensity was achieved based on a new concept with modularized neutrino detectors. INGRID (Interactive Neutrino GRID) is an on-axis near detector for the T2K long baseline neutrino oscillation experiment. INGRID consists of 16 identical modules arranged in horizontal and vertical arrays around the beam center. The module has a sandwich structure of iron target plates and scintillator trackers. INGRID directly monitors the muon neutrino beam profile center and intensity using the number of observed neutrino events in each module. The neutrino beam direction is measured with accuracy better than 0.4 mrad from the measured profile center. The normalized event rate is measured with 4% precision.

Submitted 14 November, 2011; originally announced November 2011.

Comments: 32 pages, 27 figures, submitted to Nucl. Instr. and Meth. A

arXiv:1106.2822 [pdf, other]

doi 10.1103/PhysRevLett.107.041801

Indication of Electron Neutrino Appearance from an Accelerator-produced Off-axis Muon Neutrino Beam

Authors: T2K Collaboration, K. Abe, N. Abgrall, Y. Ajima, H. Aihara, J. B. Albert, C. Andreopoulos, B. Andrieu, S. Aoki, O. Araoka, J. Argyriades, A. Ariga, T. Ariga, S. Assylbekov, D. Autiero, A. Badertscher, M. Barbi, G. J. Barker, G. Barr, M. Bass, F. Bay, S. Bentham, V. Berardi, B. E. Berger, I. Bertram , et al. (387 additional authors not shown)

Abstract: The T2K experiment observes indications of $νにゅー_μみゅー\rightarrow νにゅー_e$ appearance in data accumulated with $1.43\times10^{20}$ protons on target. Six events pass all selection criteria at the far detector. In a three-flavor neutrino oscillation scenario with $|Δでるたm_{23}^2|=2.4\times10^{-3}$ eV$^2$, $\sin^2 2θしーた_{23}=1$ and $\sin^2 2θしーた_{13}=0$, the expected number of such events is 1.5$\pm$0.3(syst.). Under this hypothesis, the probability to observe six or more candidate events is 7$\times10^{-3}$, equivalent to 2.5$σしぐま$ significance. At 90% C.L., the data are consistent with 0.03(0.04)$<\sin^2 2θしーた_{13}<$ 0.28(0.34) for $δでるた_{\rm CP}=0$ and a normal (inverted) hierarchy.

Submitted 25 July, 2011; v1 submitted 14 June, 2011; originally announced June 2011.

Comments: 20 pages, 6 figures, version published in PRL

Journal ref: Phys.Rev.Lett.107:041801,2011

arXiv:1106.1238 [pdf, ps, other]

doi 10.1016/j.nima.2011.06.067

The T2K Experiment

Authors: T2K Collaboration, K. Abe, N. Abgrall, H. Aihara, Y. Ajima, J. B. Albert, D. Allan, P. -A. Amaudruz, C. Andreopoulos, B. Andrieu, M. D. Anerella, C. Angelsen, S. Aoki, O. Araoka, J. Argyriades, A. Ariga, T. Ariga, S. Assylbekov, J. P. A. M. de André, D. Autiero, A. Badertscher, O. Ballester, M. Barbi, G. J. Barker, P. Baron , et al. (499 additional authors not shown)

Abstract: The T2K experiment is a long-baseline neutrino oscillation experiment. Its main goal is to measure the last unknown lepton sector mixing angle $θしーた_{13}$ by observing $νにゅー_e$ appearance in a $νにゅー_μみゅー$ beam. It also aims to make a precision measurement of the known oscillation parameters, $Δでるたm^{2}_{23}$ and $\sin^{2} 2θしーた_{23}$, via $νにゅー_μみゅー$ disappearance studies. Other goals of the experiment include various neutrino cross section measurements and sterile neutrino searches. The experiment uses an intense proton beam generated by the J-PARC accelerator in Tokai, Japan, and is composed of a neutrino beamline, a near detector complex (ND280), and a far detector (Super-Kamiokande) located 295 km away from J-PARC. This paper provides a comprehensive review of the instrumentation aspect of the T2K experiment and a summary of the vital information for each subsystem.

Submitted 8 June, 2011; v1 submitted 6 June, 2011; originally announced June 2011.

Comments: 33 pages, 32 figures, Submitted and accepted by NIM A. Editor: Prof. Chang Kee Jung, Department of Physics and Astronomy, SUNY Stony Brook, chang.jung@sunysb.edu, 631-632-8108 Submit Edited to remove line numbers

arXiv:cmp-lg/9609007 [pdf, ps, other]

Discourse Coherence and Shifting Centers in Japanese Texts

Authors: Masayo Iida

Abstract: In languages such as Japanese, the use of {\it zeros}, unexpressed arguments of the verb, in utterances that shift the topic involves a risk that the meaning intended by the speaker may not be transparent to the hearer. However, this potentially undesirable conversational strategy often occurs in the course of naturally-occurring discourse. In this chapter, I report on an empirical study of 250 utterances with {\it zeros} in 20 Japanese newspaper articles. Each utterance is analyzed in terms of centering transitions and the form in which centers are realized by referring expressions. I also examine lexical subcategorization information, and tense and aspect in order to test the hypothesis that the speaker expects the hearer to use this information in determining global discourse structure. I explain the occurrence of {\it zeros} in {\sc retain} and {\sc rough-shift} centering transitions, by claiming that a {\it zero} can only be used in these cases when the shift of centers is supported by contextual information such as lexical semantics, tense and aspect, and agreement features. I then propose an algorithm by which centering can incorporate these observations to integrate centering with global discourse structure, and thus enhance its ability for non-local pronoun resolution.

Submitted 24 September, 1996; originally announced September 1996.

Comments: 20 pages, uses elsart12.sty, lingmacros.sty, named.sty

Journal ref: Centering in Discourse, Oxford University Press; Eds. Walker, Joshi and Prince, In Press

arXiv:cmp-lg/9609006 [pdf, ps, other]

Japanese Discourse and the Process of Centering

Authors: Marilyn Walker, Masayo Iida, Sharon Cote

Abstract: This paper has three aims: (1) to generalize a computational account of the discourse process called {\sc centering}, (2) to apply this account to discourse processing in Japanese so that it can be used in computational systems for machine translation or language understanding, and (3) to provide some insights on the effect of syntactic factors in Japanese on discourse interpretation. We argue that while discourse interpretation is an inferential process, syntactic cues constrain this process, and demonstrate this argument with respect to the interpretation of {\sc zeros}, unexpressed arguments of the verb, in Japanese. The syntactic cues in Japanese discourse that we investigate are the morphological markers for grammatical {\sc topic}, the postposition {\it wa}, as well as those for grammatical functions such as {\sc subject}, {\em ga}, {\sc object}, {\em o} and {\sc object2}, {\em ni}. In addition, we investigate the role of speaker's {\sc empathy}, which is the viewpoint from which an event is described. This is syntactically indicated through the use of verbal compounding, i.e. the auxiliary use of verbs such as {\it kureta, kita}. Our results are based on a survey of native speakers of their interpretation of short discourses, consisting of minimal pairs, varied by one of the above factors. We demonstrate that these syntactic cues do indeed affect the interpretation of {\sc zeros}, but that having previously been the {\sc topic} and being realized as a {\sc zero} also contributes to the salience of a discourse entity. We propose a discourse rule of {\sc zero topic assignment}, and show that {\sc centering} provides constraints on when a {\sc zero} can be interpreted as the {\sc zero topic}.

Submitted 24 September, 1996; originally announced September 1996.

Comments: 38 pages, uses clstyle, lingmacros

Journal ref: Computational Linguistics 20-2, 1994

arXiv:cmp-lg/9609005 [pdf, ps, other]

Centering in Japanese Discourse

Authors: Marilyn Walker, Masayo Iida, Sharon Cote

Abstract: In this paper we propose a computational treatment of the resolution of zero pronouns in Japanese discourse, using an adaptation of the centering algorithm. We are able to factor language-specific dependencies into one parameter of the centering algorithm. Previous analyses have stipulated that a zero pronoun and its cospecifier must share a grammatical function property such as {\sc Subject} or {\sc NonSubject}. We show that this property-sharing stipulation is unneeded. In addition we propose the notion of {\sc topic ambiguity} within the centering framework, which predicts some ambiguities that occur in Japanese discourse. This analysis has implications for the design of language-independent discourse modules for Natural Language systems. The centering algorithm has been implemented in an HPSG Natural Language system with both English and Japanese grammars.

Submitted 24 September, 1996; originally announced September 1996.

Comments: 7 pages, uses twocolumn

Journal ref: COLING90: Proceedings 13th International Conference on Computational Linguistics, Helsinki

arXiv:cmp-lg/9506009 [pdf, ps]

Filling Knowledge Gaps in a Broad-Coverage Machine Translation System

Authors: Kevin Knight, Ishwar Chander, Matthew Haines, Vasileios Hatzivassiloglou, Eduard Hovy, Masayo Iida, Steve K. Luk, Richard Whitney, Kenji Yamada

Abstract: Knowledge-based machine translation (KBMT) techniques yield high quality in domains with detailed semantic models, limited vocabulary, and controlled input grammar. Scaling up along these dimensions means acquiring large knowledge resources. It also means behaving reasonably when definitive knowledge is not yet available. This paper describes how we can fill various KBMT knowledge gaps, often using robust statistical techniques. We describe quantitative and qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT system.

Submitted 9 June, 1995; originally announced June 1995.

Comments: 7 pages, Compressed and uuencoded postscript. To appear: IJCAI-95

arXiv:cmp-lg/9409001 [pdf, ps]

Integrating Knowledge Bases and Statistics in MT

Authors: Kevin Knight, Ishwar Chander, Matthew Haines, Vasileios Hatzivassiloglou, Eduard Hovy, Masayo Iida, Steve K. Luk, Akitoshi Okumura, Richard Whitney, Kenji Yamada

Abstract: We summarize recent machine translation (MT) research at the Information Sciences Institute of USC, and we describe its application to the development of a Japanese-English newspaper MT system. Our work aims at scaling up grammar-based, knowledge-based MT techniques. This scale-up involves the use of statistical methods, both in acquiring effective knowledge resources and in making reasonable linguistic choices in the face of knowledge gaps.

Submitted 5 September, 1994; originally announced September 1994.

Comments: 8 pages, compressed, uuencoded postscript

Journal ref: Proc Association for Machine Translation in the Americas (AMTA-94)

Showing 1–19 of 19 results for author: Iida, M