# Keynotes

## CANDAR Keynote 1

- Chair: Masahiro Goshima (NII)
- Speaker: **Akira Asato** (Fujitsu)
- Title: **Post-K Supercomputer with Fujitsu's Original CPU, Powered by ARM ISA**
- Abstract: RIKEN and Fujitsu are currently developing the successor to the K computer as a Japanese national project. Post-K inherits the system architecture of the K computer and targets up to 100 times higher application performance than the K computer. We have chosen the ARMv8 and SVE instruction set architecture for the Post-K CPU to widen application opportunities and contribute to developing the ARM ecosystem for HPC applications, as well as to improve science and society. In this talk, an overview of the Post-K supercomputer is presented along with some ISA-related topics.

## CANDAR Keynote 2

- Chair: Satoshi Fujita (Hiroshima University)
- Speaker: **Ryuhei Uehara** (JAIST)
- Title: **Folding and unfolding algorithms on (super)computer**
- Abstract: Computational origami has attracted attention in computational geometry from the viewpoints of data structures and algorithms. A representative problem is to decide whether a given polygon P can be folded into a given polyhedron Q. If Q is a regular tetrahedron, there is a beautiful mathematical theorem that characterizes P as a tiling. Apart from that case, however, the problem is quite difficult in general. One reason is that the problem is quite counterintuitive. For example, there exists a polygon P of area 22 that can fold into a box of size 1×1×5 and into another box of size 1×2×3 just by changing the folding lines. Our research group found a surprising polygon of area 30 that can fold into a box of size 1×1×7, another box of size 1×3×3, and a cube of size √5×√5×√5. Moreover, there are two different ways to fold it into the cube. To find such a polygon, we needed to develop new algorithms, and we then ran our program on a supercomputer for two months. We also present other related results and open problems on folding.
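
The areas quoted in the abstract above can be checked with a little arithmetic: a polygon can only fold into a box if its area equals the box's surface area, 2(ab + bc + ca). The following is a minimal sketch of that check; the helper name `box_surface_area` is an illustrative assumption and is not taken from the talk.

```python
import math


def box_surface_area(a, b, c):
    """Surface area of an a x b x c box (hypothetical helper for illustration)."""
    return 2 * (a * b + b * c + c * a)


# Boxes that share a common unfolding of area 22.
area_22 = [box_surface_area(1, 1, 5), box_surface_area(1, 2, 3)]

# Boxes that share a common unfolding of area 30, including the cube
# with edge length sqrt(5), whose surface area is 6 * 5 = 30.
s = math.sqrt(5)
area_30 = [
    box_surface_area(1, 1, 7),
    box_surface_area(1, 3, 3),
    box_surface_area(s, s, s),
]

print(area_22)
print([round(x, 9) for x in area_30])
```

Equal surface area is only a necessary condition. Finding an actual polygon that folds into all of the boxes is the hard part that the abstract says required new algorithms and a supercomputer.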

## CANDAR Keynote 3

- Chair: Yasuhiko Nakashima (NAIST)
- Speaker: **Sunao Torii** (ExaScaler Inc.), **Hitoshi Ishikawa** (PEZY Computing K.K.)
- Title: **ZettaScaler: Liquid immersion cooling Manycore based Supercomputer**
- Abstract: We have developed the power-efficient ZettaScaler series of supercomputers. In the first generation, ZettaScaler-1.x, we adopted three novel technologies: the MIMD ultra-manycore processor "PEZY-SC", the high-density server board "Brick", and a direct liquid immersion cooling system. Together they enable high performance, low power consumption, and a small footprint at the same time. In this talk, we will explain the unique hardware technologies and programming model of the ZettaScaler-1.0 generation. In addition, we will describe the three-dimensional mounting technology with DRAM using magnetic-coupling TCI (ThruChip Interface), the new Brick structure, and the enhanced cooling system of the next-generation ZettaScaler-2.0, which is currently under construction.

## AFCA Keynote

- Chair: Hiroshi Umeo (Osaka Electro-Communication University)
- Speaker: **Georgios Ch. Sirakoulis** (Democritus University of Thrace)
- Title: **Cellular Automata based computing**
- Abstract: Cellular automata (CAs) have been established as one of the most promising computing architectures for the post-von Neumann computing era. Their intrinsic characteristics, such as inherent full parallelism, tolerable design, and emergent computation, have proven to be the cornerstones for developing efficient and robust computational models able to reproduce sufficiently complex physical and chemical processes and phenomena. In this talk, CAs will be envisaged as the finest medium for the successful implementation of non-von Neumann computing paradigms drawn directly from biological organisms, nature-inspired computing, and chemical processing. Special attention will be given to the exact hardware implementation of the proposed CAs, which ranges from digital implementations in Field Programmable Gate Array (FPGA) devices to analog hardware implementations using nanoelectronic memristor devices. The proposed models perform effectively as enhanced CA-based biological and chemical virtual computers with unrivalled computing features in terms of resulting complexity and performance, which become especially apparent when performing complex and NP-complete computational tasks.

## ASON Keynote

- Chair: Takuya Asaka (Tokyo Metropolitan University)
- Speaker: **Takumi MIYOSHI** (Shibaura Institute of Technology)
- Title: **Location-based Peer-to-peer Communication: Bringing Efficient Locality-aware Networks**
- Abstract: Peer-to-peer (P2P) communications have been widely used in many areas since the popularization of file sharing systems. P2P systems generally form a virtual overlay network, which may be physically inefficient because it disregards peers' locations. We expect that P2P has more potential when it is used in combination with the location information of peers. In this talk, I will present two research topics we are working on related to location-aware P2P communications: router-aided P2PTV traffic localization and a location-based P2P communication system. In the former topic, we propose a hierarchical delay insertion method for P2PTV. By degrading longer-distance communication between peers, the method affects the peer selection mechanism in P2PTV applications, which then tends to connect geographically closer peers. In the latter topic, we propose a location-based P2P communication architecture. This concept will realize neighborhood communications such as inter-vehicular information exchange, mutual cooperation networks during natural disasters, and event-oriented temporary communication.

## CSA Keynote

- Chair: Tomoaki Tsumura (Nagoya Inst. of Tech.)
- Speaker: **Mutsumi Kimura** (Ryukoku Univ.)
- Title: **Neuromorphic Hardware using Simplified Elements and Thin-Film Semiconductor Devices**
- Abstract: Neuromorphic hardware using simplified elements and thin-film semiconductor devices is proposed. First, we succeeded in simplifying the processing elements: a neuron circuit is composed of four transistors, and a synapse device consists of only a conductive film. Here, the characteristic change of the thin-film semiconductor devices is used as the weight change of the synapse device by adopting a training rule named modified Hebbian learning. With these simplified elements, an astronomical number of processing elements can be integrated. Next, in order to confirm the elementary operation, we developed a cellular neural network using poly-Si thin-film devices, which learned simple logic functions and alphabet reproduction. We have now developed a Hopfield neural network using amorphous oxide thin-film devices, which also learned simple logic functions and alphabet reproduction. We believe that neuromorphic hardware using thin-film semiconductor devices will be a key technology for artificial intelligence because three-dimensional integration is possible and the system size and power consumption can be reduced.

## GCA Keynote

- Chair: Jacir L. Bordim (University of Brasilia)
- Speaker: **Takashi Ishida** (Tokyo Institute of Technology)
- Title: **GPU-accelerated computing for bioinformatics**
- Abstract: The size of biological data has been increasing rapidly because of improvements in technologies such as DNA sequencers. Acceleration of bioinformatics software is therefore in high demand for analyzing such huge amounts of data. However, the variety of bioinformatics applications is so large that we have to develop specific algorithms for each application. In this talk, we introduce GPU implementations of two different bioinformatics applications: protein-protein interaction prediction based on docking calculation, and sequence homology search for metagenome analysis.

## LHAM Keynote

- Chair: Hiroyuki Takizawa (Tohoku University)
- Speaker: **Albert Farrés** (Barcelona Supercomputing Center)
- Title: **Optimization strategies for HPC applications on newer and upcoming architectures**
- Abstract: New hardware alternatives are appearing as a potential solution to satisfy the high computing power demands of scientific applications. The last decade has seen a trend toward building systems with dedicated devices and accelerators, which deliver a good FLOPs/Watt ratio. Among those alternatives there is a clear trend in favor of accelerated environments, where some kind of special-purpose device is attached to the host. Such accelerator devices include GPGPUs from NVIDIA and the Intel Xeon Phi. Developing high-quality applications that obtain the maximum performance of the underlying architecture is not a trivial task. In this work we show several common optimizations that can be applied to improve the performance of our codes on these newer and upcoming architectures. Software design patterns will also be introduced to make these optimizations easy to implement. Specifically, this work focuses on several optimization strategies evaluated and applied to an elastic wave propagation engine, based on a Fully Staggered Grid, running on the latest Intel Xeon Phi processors, the second generation of the product (code-named Knights Landing). The evaluated set of optimizations ranges from memory to compute optimizations.

## PDAA Keynote

- Chair: Akihiro Fujiwara (Kyushu Institute of Technology)
- Speaker: **Taisuke Izumi** (Nagoya Institute of Technology)
- Title: **Information-Theoretic Approach for Lower Bounds in Resource-Bounded Computation**
- Abstract: With the recent growth in the volume of data, it is no longer valid to assume that the whole input data fits in the memory of a stand-alone computer. This has raised interest in several unconventional computation models, such as distributed/parallel computing, stream computation, and computation using bounded working memory. In this talk, we present how the information-theoretic approach helps in obtaining lower-bound results in those models.

## WICS Keynote

- Chair: Yasuyuki Nogami (Okayama University)
- Speaker: **Noboru Kunihiro** (The University of Tokyo)
- Title: **Recovering RSA Secret Keys from Noisy Keys**
- Abstract: RSA is the most widely used cryptosystem, and its security is based on the difficulty of factoring a large composite number. Furthermore, side-channel attacks are a real threat to the RSA scheme. In this talk, we will present how to recover the RSA secret key from a noisy version of the secret key obtained through physical attacks such as cold boot and side-channel attacks. We have proposed several polynomial-time algorithms for recovering RSA secret keys from observed analog data. We will explain those algorithms within a unified framework and give the success conditions for the attacks.

## WANC Keynote

- Chair: Shuichi Ichikawa (Toyohashi University of Technology)
- Speaker: **Shinya TAKAMAEDA-YAMAZAKI** (Hokkaido University)
- Title: **Accelerating Deep Learning by Hardware/Algorithm Co-Design**
- Abstract: For the upcoming post-Moore era, innovative computer architecture technologies other than traditional processor technologies are desired for continuous improvement of computing performance and efficiency. As an important trend in computer architecture, deep neural networks are now used as a general-purpose technology for various applications, and various hardware accelerators for deep neural network processing have been proposed. For the continuous evolution of computers, qualitative changes in computer architecture through co-design of algorithms and hardware will be important. In this talk, we present our recent work on hardware/algorithm co-design for efficient deep neural network processing, such as a low-power quantized neural network processor and hardware-aware learning technology. Rather than implementing hardware for a fixed algorithm, architecture design that exploits algorithm characteristics, combined with hardware-aware algorithm design, provides higher computing performance, energy efficiency, and solution quality.

## GraphGolf Keynote

- Chair: Ikki Fujiwara (NICT)
- Speaker: **Osamu Watanabe** (Tokyo Institute of Technology)
- Title: **Graph Golf at SuperCon 2016**
- Abstract: Super Computer Programming Contest (SuperCon for short) is a programming contest for high school students using a supercomputer system at Tokyo Institute of Technology and Osaka University. It has been held every summer since 1995. A unique feature of the contest is that the goal each year is to develop a program for solving one computationally hard problem during the four days of the contest. In the summer of 2016 we used a problem from Graph Golf. In this talk, I will explain how high school students attacked the problem.