I would like to express my sincere gratitude to the wine makers and those who work for the supply chain, as well as those who promoted the global free trade

We pay the same price for the same good in a perfect market. It is, however, not the case for many items that we consume in our daily lives. In this article I am going to look at the prices of wine, which are oftentimes mysterious, across the border between France and Switzerland.


In this article I briefly examine the difference between french wine prices at Coop.ch and the lowest…

Sorting an array with “Heap” data structure

In previous posts, we looked at different sorting algorithms such as Merge Sort and Quicksort, and this time another sorting algorithm, Heapsort, is discussed. Heapsort, as shown in its name, uses “Heap” data structure.

Photo by Susan Holt Simpson on Unsplash

Heap is a data structure, and heaps are arrays of binary trees. Each node of the binary tree corresponds to one element of the array. As one node has zero, one, or two children nodes, for i-th element of the array, 2i-th and (2i+1)-th elements are its left and right children respectively.

The following figure depicts an example heap in which the numbers in nodes equal…

Sorting an array with randomly selected pivots

I introduced a sorting algorithm called Merge-Sort in a previous article and continue writing about another sorting algorithm, Quicksort, in this post. The expected cost of Quicksort is Θ(nlgn), while the worst case that costs Θ(n²) would materialize only at a probability of 2/n!. I will show later in the performance comparison that the constant hidden in Θ notation is lower in Quicksort thus the algorithm outperforms Merge-Sort whose cost is same Θ(nlgn).

Photo by Andrew Buchanan on Unsplash

Insertion-Sort is a brute-force algorithm that compares all the pairs in an array to sort it. This is how people commonly sort their hands in porker or…

数列を並べ換えるアルゴリズムばかり書いていますが、とりあえず飽きずに続けています。今回学んだのはQuick Sortというアルゴリズムで、今までのものと異なり、確率論が絡んできます。

Quick Sortが何をするかというと、数列の一項(「ピボット」と呼びます)を選び、その値より小さい項をその左側に、それより大きな項を右に放り込むというものです。このアルゴリズムをピボットの左右の数列にそれぞれ使うということを繰り返してゆけば、最終的に昇順に並べ替えられた数列を得ることができますね。


Determining the closest pair of two points on a two-dimensional plane with a split-conquer algorithm

Writing cost-efficient algorithms is one of the keys to succeed as a data scientist, and in the previous article we used split-conquer method in counting inversions in an array, which is far less costly than brute force method. This time, we will see how another split-conquer algorithm finds the closest pair of points from a set of points on a two-dimensional plane.

Photo by Rémy Penet on Unsplash

Looking up at the sky at Times Square or downtown Tokyo, we would find the closest pair of stars quite easily, because you can see only a few stars there. …

日本へのお土産はWasenhaus, Julia Bertramなどドイツを中心に3ケース

年末年始は欧州、日本を6週間のわたって旅してData Scienceはお休みしていましたが、最近再開しました。相変わらず、基礎体力訓練のようなことをやっています。

今回の課題は、ランダムな重複のない数列を与えられ、逆順の数字のペアの数を数えるというもの。逆順というのは、2整数 i<j について、数列の i番目の数が j番目より大きい場合を言うようです。

数列の項数 n について、総当たりで比較する場合は非常に単純で、n(n-1)/2だけ計算すれば良いのですね。コストはΘ(n²)です。コードも単純で、loopを二つかませれば良いだけ。数行で終わってしまいます。

def bf_inversion_count(array):
count = 0
for i in range(len(a …

How much more efficiently the split-conquer method finds inverted pairs in a random array than the brute-force method does

Even with significant advancement in computing capacities in recent years, it is still costly to operate on large datasets. Writing cost-efficient codes is, thus, one of the critical skills for data scientists. While an algorithm whose cost is linear to the input data size may be not so different from one that costs square, the former offers a huge advantage when the dataset increases to millions.

One basic way to write more efficient algorithms is the split-conquer method, where we write a nested function that applies the same function to the subset of the input recursively. …

夏からData Scienceの応用的な分野を学んできましたが、思い立ってAlgorithmについて初歩から勉強し始めました。今週からCourseraのStanfordのコースを聴講しています。



def find_closest_bru …


It is widely discussed that the US economy will face another recession in coming years. The question is when and how it will occur. We have seen a decade of economic expansion but it will not last forever. In this article I will try a basic Recurrent Neural Network (RNN) model to foresee how the US economy will grow, or possibly shrink in the near future.


You may know that we have had not many recession periods in the past few decades, the most recent one being after the Global Financial Crisis of 2008. It can be a hurdle for…

6ヶ月先の米国のGDPギャップを人工知能を使って予測するモデルを作ってみました。結論は、やはり難しいというものです。Recurrent Neural Networks (RNN)というモデルを用いました。致命的な問題は、やはり経済構造や経済危機の原因は時とともに変化することと、利用可能なデータが限られていることでしょうか。それと現実的に、私のmid-2012 Macbook Proでは色々試行しているとメモリがパンパンになって動かなくなり、ストレスが溜まることも…


Unfolded basic recurrent neural network (Wikipedia)


regressor = Sequential()regressor.add(LSTM(units=50, 


regressor.add(LSTM(units=50, return_sequences=True))

regressor.add(Dense(units=1))regressor.compile(optimizer='adam', loss='mean_squared_error')regressor.fit(X_train_rnn,


Keita Miyaki

Keita is an aspiring data scientist with expertise in finance and investment, a proud Japanese national, a chef, Judo black belt, a calligrapher, and a drunkard

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store