jwork.org

Science, Technology and Computation

Arguments in Favor of Python for Web Scraping

December 17, 2024 Reading time: 4 minutes

Because of its portability, large community, and ease of use and learning, Python is one of the most widely used programming languages worldwide. It is prominent across contemporary data-related domains, such as web scraping, machine learning, and data analysis. Writing a Hello World program in Python is much simpler than in most other programming languages, particularly those that resemble C.

Nevertheless, web scraping comes with difficulties. Websites differ widely in type, style, and underlying technology, and each one is constructed in its own way. Most are unique, although some generic structures do repeat across sites. Websites also change constantly. This means that a Python script that works properly today may later hit an error and fail to retrieve data.
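As a minimal sketch of that failure mode, the snippet below uses only the Python standard library and two made-up HTML strings. The `h1` element with class `title`, the page contents, and the names `TitleFinder` and `extract_title` are all hypothetical and chosen only for illustration. The point is that the extractor reports a missing element instead of crashing when the layout changes.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text of the first <h1 class="title"> (hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h1" and ("class", "title") in attrs and self.title is None:
            self._inside = True

    def handle_data(self, data):
        if self._inside:
            self.title = data.strip()
            self._inside = False


def extract_title(html):
    """Return the title text, or None if the expected element is missing."""
    finder = TitleFinder()
    finder.feed(html)
    return finder.title


# Two made-up versions of the same page: before and after a redesign.
old_page = '<html><body><h1 class="title">Prices today</h1></body></html>'
new_page = '<html><body><div class="headline">Prices today</div></body></html>'

for name, page in [("old layout", old_page), ("new layout", new_page)]:
    title = extract_title(page)
    if title is None:
        print(name, "-> expected element not found; the layout may have changed")
    else:
        print(name, "->", title)
```

Returning `None` rather than raising lets the calling script log the problem and continue with other pages.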

Read more ...


Stunning 3D visualization with JavaView

February 21, 2021 Reading time: 3 minutes

JavaView (http://www.javaview.de/) is a 3D geometry viewer and mathematical visualization program that has been around since the 1990s. It is written in Java and integrates smoothly with commercial software such as Mathematica and Maple. JavaView can be used for 3D scientific visualization, geometric modeling, variational optimization, vector fields, etc.

Read more ...


Using Multi-Layer Recurrent Neural Network for language models

February 22, 2018 Reading time: ~1 minute

Here is another example of how to use the Multi-Layer Recurrent Neural Network (RNN package) designed for character-level language models. This neural network was trained on 165,000+ real titles of acts submitted to Congress, taken from CONGRESS.GOV. The training was performed on a GPU. The trained RNN was then used to create "fake" titles. Use this link to find which bill title is real and which is created by the RNN.

This example of the RNN package is provided by Jahred Adelman (NIU).


Using Neural Networks to create titles for scientific papers

February 22, 2017 Reading time: ~1 minute

Neural networks are getting smarter, and their outputs are becoming increasingly realistic. Here is a neural network example created by Jahred Adelman, who used the Multi-Layer Recurrent Neural Network (RNN package) designed for character-level language models. The neural net was trained on 600,000+ real paper titles taken from INSPIRE. It was then used to create "fake" paper titles. Use this link to find which article title is fake and which is real.

S.Chekanov (ANL)


Recasting Java neural networks in Python

February 22, 2016 Reading time: 2 minutes

Many neural network applications implemented in Java, such as Neuroph, Encog and Joone, may look rather different when you switch from Java to Python with the help of the DMelt computing environment. First of all, they look simpler. You can use your favorite Python tricks to load and display data. Python code is easier to read and quicker to modify, and it does not require recompiling after each change. At the same time, platform independence and a multi-threading environment are provided by the Java Virtual Machine.

As a simple example, consider Joone, the Java Object Oriented Neural Network. It is a well-tested open source project that offers highly adaptable neural networks for Java programmers. For the purposes of this blog, we will teach Joone to recognize a very simple pattern, such as XOR. The XOR operation's truth table is summarized as

[[0.0, 0.0, 0.0],[0.0, 1.0, 1.0],[1.0, 0.0, 1.0],[1.0, 1.0, 0.0]]

where the first two columns are inputs, and the last column is the output. Now let us put together a simple Python code that uses Joone's Java API:

http://datamelt.org/code/code.php?id=49760300.py

We assume that the neural network has an input layer with two neurons, one hidden layer with 3 neurons, and an output layer with one neuron. Now start the DMelt program and run this code in its editor. If you are using Mac/Linux, you can also run it in batch mode with the command "dmelt_batch.sh code.py". You will see how the global error changes as a function of the epoch. This trend is shown here.
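For readers who want to see the idea without DMelt installed, here is a plain-Python sketch of the same kind of training. It does not use Joone's API and is only a stand-in: a small sigmoid network with two inputs, 3 hidden neurons, and one output, trained by ordinary backpropagation. The learning rate, number of epochs, random seed, and the function name `train` are assumptions made for this illustration.

```python
import math
import random

# XOR truth table: the first two columns are inputs, the last is the target.
table = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
inputs = [row[:2] for row in table]
targets = [row[2] for row in table]

N_IN, N_HID = 2, 3       # layer sizes from the text (plus one output neuron)
LEARNING_RATE = 0.5      # assumed value
EPOCHS = 5000            # assumed value


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(seed=1):
    """Train the network and return the RMS error recorded for each epoch."""
    rng = random.Random(seed)
    # Each neuron stores its input weights followed by one bias term.
    w_hid = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
    w_out = [rng.uniform(-1, 1) for _ in range(N_HID + 1)]
    errors = []
    for _ in range(EPOCHS):
        sq_sum = 0.0
        for x, t in zip(inputs, targets):
            # Forward pass
            h = [sigmoid(sum(w[i] * x[i] for i in range(N_IN)) + w[N_IN])
                 for w in w_hid]
            y = sigmoid(sum(w_out[j] * h[j] for j in range(N_HID)) + w_out[N_HID])
            sq_sum += (t - y) ** 2
            # Backward pass: deltas are computed before any weight changes
            d_out = (t - y) * y * (1.0 - y)
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(N_HID)]
            # Weight update after each pattern
            for j in range(N_HID):
                w_out[j] += LEARNING_RATE * d_out * h[j]
                for i in range(N_IN):
                    w_hid[j][i] += LEARNING_RATE * d_hid[j] * x[i]
                w_hid[j][N_IN] += LEARNING_RATE * d_hid[j]
            w_out[N_HID] += LEARNING_RATE * d_out
        errors.append(math.sqrt(sq_sum / len(inputs)))
    return errors


errors = train()
for epoch in (1, 1000, EPOCHS):
    print("epoch", epoch, "RMS error", round(errors[epoch - 1], 4))
```

The printed values play the role of the global error per epoch. The exact numbers depend on the seed and the assumed settings, so they will not match Joone's output.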

Joone can be used for rather sophisticated tasks that involve future predictions. A popular example of Python calling the Joone library deals with forecasting in time series. For this task, we generate a time series in the form cos(x)*sin(x), add noise, and use this as input for our neural network training. The output image of our forecast code implemented in Python is shown here.
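The data preparation step can be sketched in plain Python as follows. This is not the DMelt forecast code itself. The number of points, the step in x, the Gaussian noise level, the window length, and the helper names `make_series` and `make_windows` are all assumptions for illustration. Each training sample pairs a short window of past values with the value that follows it.

```python
import math
import random

N_POINTS = 200       # assumed number of points
STEP = 0.1           # assumed spacing in x
NOISE_SIGMA = 0.05   # assumed standard deviation of the Gaussian noise
WINDOW = 5           # assumed number of past values used as network input


def make_series(n=N_POINTS, step=STEP, sigma=NOISE_SIGMA, seed=0):
    """Return cos(x)*sin(x) sampled at x = 0, step, 2*step, ... with noise added."""
    rng = random.Random(seed)
    return [math.cos(i * step) * math.sin(i * step) + rng.gauss(0.0, sigma)
            for i in range(n)]


def make_windows(series, window=WINDOW):
    """Pair each run of `window` consecutive values with the value that follows."""
    samples = []
    for start in range(len(series) - window):
        samples.append((series[start:start + window], series[start + window]))
    return samples


series = make_series()
samples = make_windows(series)
print(len(series), "points,", len(samples), "training samples")
```

A network trained on such pairs predicts the next value from the previous `WINDOW` values, which is one common way to set up a time-series forecast.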

See more neural network examples implemented in Python/Jython and Java in DMelt Neural Network code examples.