2018-12-13

The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-


Recently, when I ran Spark + HDFS on a Win7 desktop, I always got the following error message in Spark's driver log.

java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-

Somebody suggested that I try running winutils ls /tmp/hive on Win7. I tried, and got this error.

access denied or FindFileOwnerAndPermission error (1789): The trust relationship between this workstation and the primary domain failed.

It seemed something was wrong on my local Win7 machine. I struggled for a while and finally found the reason. I had recently modified the DNS setting of my local network connection and entered a wrong IP address, so winutils could not reach our Domain Controller for user verification. Once I set the correct DNS and restarted the machine, the issue was gone!
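
As a side note on the error message itself, here is a small sketch of why a mode like rw-rw-rw- can be rejected as "not writable". It rests on an assumption that is not verified in this post: that the check requires at least the permission bits 733 (rwx-wx-wx) on the scratch directory. The function names are made up for illustration.

```python
# Assumption (not confirmed in this post): the scratch dir needs at least
# the bits rwx-wx-wx, i.e. octal 733.
REQUIRED_MODE = 0o733


def mode_from_string(perm: str) -> int:
    """Turn a 9-character string such as 'rw-rw-rw-' into an integer mode."""
    if len(perm) != 9:
        raise ValueError("expected 9 permission characters, got %r" % perm)
    mode = 0
    for ch in perm:
        # Every position that is not '-' contributes one set bit.
        mode = (mode << 1) | (0 if ch == "-" else 1)
    return mode


def looks_writable(perm: str) -> bool:
    """True if every bit of REQUIRED_MODE is present in perm."""
    return (mode_from_string(perm) & REQUIRED_MODE) == REQUIRED_MODE


if __name__ == "__main__":
    for perm in ("rw-rw-rw-", "rwx-wx-wx", "rwxrwxrwx"):
        print(perm, oct(mode_from_string(perm)), looks_writable(perm))
```

Under that assumption, rw-rw-rw- (666) fails only because the execute bits are missing, which matches the permissions reported in the error.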

2018-12-06

Show Chinese Characters in Matplotlib Plot on Mac-OS

Mac-OS uses different font files from Windows. If a string containing Chinese characters is shown in a matplotlib chart, the Chinese characters are rendered as [][], and there is a warning message:

UserWarning: findfont: Font family [u'sans-serif'] not found. Falling back to Bitstream Vera Sans (prop.get_family(), self.defaultFamily[fontext]))

The following steps solve this issue:

  1. Download Chinese font simhei.ttf from http://www.font5.com.cn/font_download.php?id=151&part=1237887120
  2. Extract the RAR file and copy simhei.ttf to anaconda3/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf.
  3. Edit anaconda3/lib/python3.7/site-packages/matplotlib/mpl-data/matplotlibrc. Find the sans-serif line, uncomment it, and save the change.
  4. Remove all files and directories under ~/.matplotlib.
  5. Restart jupyter notebook.
  6. In the Python program, add the following lines before plotting:

        from pylab import mpl
        mpl.rcParams['font.sans-serif'] = ['SimHei']
        mpl.rcParams['axes.unicode_minus'] = False
     
Now Chinese characters are displayed correctly in the chart.
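
Step 3 can also be done by a script instead of by hand. The sketch below only works on the text of matplotlibrc and uses the standard library. It assumes the line to uncomment is the one that starts with font.sans-serif; the function name is hypothetical.

```python
def uncomment_sans_serif(rc_text: str) -> str:
    """Return rc_text with a commented-out 'font.sans-serif' line uncommented.

    Assumption: the setting appears as '#font.sans-serif: ...' (possibly
    with spaces around '#'); all other lines are left untouched.
    """
    out = []
    for line in rc_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            candidate = stripped.lstrip("#").lstrip()
            if candidate.startswith("font.sans-serif"):
                line = candidate
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "#font.family: sans-serif\n#font.sans-serif: DejaVu Sans, Arial"
    print(uncomment_sans_serif(sample))
```

To apply it, read matplotlibrc into a string, pass it through the function, and write the result back. Keep a backup of the original file first.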


2017-02-24

Mount NTFS disk to Mac OS for Read and Write

Most removable hard disks are in NTFS format. When one is attached to Mac OS, it is automatically mounted as a read-only file system, so you cannot copy files onto it. To write data, you usually have to find software that supports writable NTFS, for example, Mounty.

I do not suggest downloading Mounty. Many users have reported a very serious bug: in some unknown cases it damages the hard disk, probably because it does not manage the NTFS indices well. If that happens, you can no longer write any data. When you attach the disk back to Windows, the system warns that it is not usable, and you have to repair it with a tool named chkdsk. Sometimes data is lost.

Apple does include an NTFS module in Mac OS, which is why it supports read-only NTFS. It does not support writable NTFS out of the box. Is it because NTFS is a Microsoft standard?

Anyway, we can manually mount a writable NTFS disk on Mac OS.

First we should find the disk device name. Attach the hard disk to the USB socket and run the command:

$ df
....
/dev/disk5s1                  1953520808 1457336616  496184192    75%   150877  248094691    0%   /Volumes/TOSHIBA EXT
....

The entry /dev/disk5s1 is my disk, which is mounted at "/Volumes/TOSHIBA EXT".
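
Picking the device out of the df output can be scripted. This is a minimal sketch, assuming the device is the first column and the mount point is the last thing on the line (it may contain spaces, as here). The function name is made up for illustration.

```python
from typing import Optional


def find_device(df_output: str, mount_point: str) -> Optional[str]:
    """Return the device whose df line ends with mount_point, or None."""
    for line in df_output.splitlines():
        line = line.rstrip()
        # The mount point is the last field and may itself contain spaces,
        # so compare the end of the line instead of splitting it.
        if line.endswith(" " + mount_point):
            return line.split()[0]
    return None


if __name__ == "__main__":
    sample = (
        "/dev/disk5s1                  1953520808 1457336616  496184192"
        "    75%   150877  248094691    0%   /Volumes/TOSHIBA EXT"
    )
    print(find_device(sample, "/Volumes/TOSHIBA EXT"))
```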

Next, unmount it from the system, and then mount it as a writable device:

$ sudo umount /dev/disk5s1
$ sudo mount_ntfs -o rw,sync,nobrowse /dev/disk5s1 /Users/trumpet/toshiba

The new mount point is /Users/trumpet/toshiba. It is a read-write device now.

Put these commands into a shell script to configure it easily.
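
Instead of a shell script, the same two commands could be wrapped in a small Python script. This is only a sketch: the function names are hypothetical, and by default it prints the commands without running them.

```python
import shlex
import subprocess
from typing import List


def remount_commands(device: str, mount_point: str) -> List[List[str]]:
    """Build the unmount and read-write mount commands used in this post."""
    return [
        ["sudo", "umount", device],
        ["sudo", "mount_ntfs", "-o", "rw,sync,nobrowse", device, mount_point],
    ]


def remount(device: str, mount_point: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in remount_commands(device, mount_point):
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: shows what would be executed.
    remount("/dev/disk5s1", "/Users/trumpet/toshiba")
```

Call remount(..., dry_run=False) only after checking that the printed device name really is the external disk.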

However, I have still seen a case where this damages the file system. I think it is an issue in the Mac OS implementation. While you copy files to the disk, keep the machine awake and do not let it sleep.


2017-02-15

Configure VPN Route Table on Mac


Shimo is a tool on Mac to create a VPN tunnel. Once the connection is created, it changes the default gateway and sends all destinations through the VPN tunnel. If the VPN leads to the office intranet, this is fine from a security point of view: the local machine becomes a part of the intranet. However, if the VPN is slow or charged by traffic, connections to hosts on the Internet become slow or expensive. A possible solution is to configure the route table so that only destinations in the intranet go through the tunnel, while all other hosts are reached directly from the local machine.

First verify the default route table with command route:

~$ route get www.somehost.com
   route to: 14.215.177.38
destination: default
       mask: default
    gateway: 192.168.1.1
  interface: en1
      flags:
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0 
~$ 

It shows that by default, the host is reached via the default gateway 192.168.1.1 of interface en1.

Dial the VPN, and try again:

~$ route get www.somehost.com
   route to: 14.215.177.38
destination: default
       mask: default
  interface: ppp0
      flags:
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0      1396         0 
~$ 

It is routed to the new default gateway of interface ppp0.

Now we use the route command to change the route table:

route delete default
route -n add default 192.168.1.1
route -n add -net 20.201.0.0 -netmask 255.255.0.0 20.201.0.124

The first line deletes the default gateway. The second adds back the original default gateway 192.168.1.1. The last one adds a new entry, where 20.201.0.0 is the network of the intranet, 255.255.0.0 is the network mask, and 20.201.0.124 is the gateway of the tunnel. If the gateway of the tunnel is unknown, try this command after the tunnel is created:

netstat -rn

The gateway can be found in its output.

These three commands need root privilege. Use sudo to execute them.

Now let's verify the result. The host www.somehost.com is routed to 192.168.1.1. And a host in the intranet is routed to 20.201.0.124:

~$ route get 20.201.10.101
   route to: 20.201.10.101
destination: 20.201.0.0
       mask: 255.255.0.0
    gateway: 20.201.0.124
  interface: ppp0
      flags:
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0      1396         0 
~$ 
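
The result follows from the usual rule that the most specific matching route wins. Here is a minimal sketch of that rule using Python's ipaddress module and the two routes from this post. It is a simplified model for illustration, not how the kernel implements routing, and pick_gateway is a made-up name.

```python
import ipaddress

# Routes configured above: (network, gateway). 0.0.0.0/0 stands for "default".
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),
    (ipaddress.ip_network("20.201.0.0/255.255.0.0"), "20.201.0.124"),
]


def pick_gateway(host: str) -> str:
    """Return the gateway of the most specific route that contains host."""
    addr = ipaddress.ip_address(host)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    # The default route always matches, so matches is never empty.
    _, gateway = max(matches, key=lambda item: item[0].prefixlen)
    return gateway


if __name__ == "__main__":
    print(pick_gateway("20.201.10.101"))  # intranet host
    print(pick_gateway("14.215.177.38"))  # Internet host
```

The intranet host matches both routes, but the /16 entry is more specific than the default, so it goes through the tunnel gateway.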

Keep in mind that this route table is not safe because the local machine is not protected by the firewall of the intranet.








2015-11-06

Java Mail Attachment with Long File Name and Double Byte Characters

When using the JavaMail utility to send email via SMTP, we should encode the subject and the file name of an attachment if they contain double byte characters. They are encoded by the method MimeUtility.encodeText(text).

The behavior of Java Mail 1.4 and 1.5 is different when the file name is too long. This is what 1.4 produces in the raw email content:
Content-Type: application/octet-stream;
name="=?gb2312?B?w8C5+sjLttShsNK7tPjSu8K3obG1xCDVvSDC1CDLvL+8LTIubW9iaQ==?="
And this is what 1.5 produces:
Content-Disposition: attachment;
filename*0="=?GBK?B?w8C5+sjLttShsNK7tPjSu8K3obG1xCDVvSDC1CDLvL+8LTIubW9i";
filename*1="aQ==?="
In order to stay within the length limit of each text line, Java Mail 1.5 cuts the file name or encoded file name into multiple lines. However, many email processors or email servers do not handle this new behavior. The file is usually renamed as atxxx.octet-stream or ATTxxxxx.dat. Of course, the old behavior will not always work either if the file name is too long.
If the length is not an issue for you, be careful about upgrading Java Mail to 1.5.
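
To see what a receiver has to do with the two forms, here is a small Python sketch (Python rather than Java, just for a quick check with the standard library). It decodes the 1.4 header value directly, and for the 1.5 form it first concatenates the filename*0 and filename*1 pieces. The helper names are made up for illustration. A mail processor that does not rejoin the pieces before decoding cannot recover the name.

```python
from email.header import decode_header

# Values copied from the raw email content shown above.
OLD_STYLE = "=?gb2312?B?w8C5+sjLttShsNK7tPjSu8K3obG1xCDVvSDC1CDLvL+8LTIubW9iaQ==?="
NEW_STYLE_PARTS = [
    "=?GBK?B?w8C5+sjLttShsNK7tPjSu8K3obG1xCDVvSDC1CDLvL+8LTIubW9i",
    "aQ==?=",
]


def decode_encoded_word(value: str) -> str:
    """Decode a '=?charset?B?...?=' value into a normal string."""
    return "".join(
        chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
        for chunk, charset in decode_header(value)
    )


def join_continuations(parts) -> str:
    """Rejoin filename*0, filename*1, ... pieces in order."""
    return "".join(parts)


if __name__ == "__main__":
    print(decode_encoded_word(OLD_STYLE))
    print(decode_encoded_word(join_continuations(NEW_STYLE_PARTS)))
```

Both calls should give the same file name, since the 1.5 pieces carry the same Base64 payload as the single 1.4 value.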


2015-06-11

Add a Startup Class in Jetty


Jetty provides a mechanism, known as a module, to instantiate a class when the Jetty server starts. Add the following line to start.ini:

--module=tasks

The new module name is tasks. We need a simple module file modules/tasks.mod with this content:

[xml]
etc/tasks.xml

And etc/tasks.xml with content:

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_0.dtd">

<Configure id="tasks" class="com.mycom.task.Schedular">
</Configure>

The class com.mycom.task.Schedular should have a no-argument constructor.

Start the Jetty server with the command java -jar start.jar, and an instance of com.mycom.task.Schedular will be created.

2015-05-21

Configure the Native Logger of Jetty

Jetty has a native logger component. Users can also configure a module that uses the log4j logger. Here is the configuration of the native logger.

1. Add one line to start.ini to enable the logging module

--module=logging

2. Modify the logging module etc/jetty-logging.xml with new properties

<Arg><Property name="jetty.logs" default="E:/work3/bw5/M3O_Cloud/Automation/log"/>/server.log</Arg>

3. Restart the Jetty server with the command

java -jar start.jar
