Wednesday, September 5, 2018

String/Templating Modules


String Pattern Matching
-----------------------

`re`
This module uses regular expressions for advanced string processing.

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

For simple cases, plain string methods are easier to read and debug:

>>> 'tea for too'.replace('too', 'two')
'tea for two'

Compiling an expression
  --> patterns are compiled into bytecode and executed by a matching engine written in C
  --> reusing a compiled pattern makes repeated matching faster

>>> import re
>>> p = re.compile('[a-z]+')
>>> p
re.compile('[a-z]+')
>>> p.match("")
>>> print(p.match(""))
None
>>> print(p.match("abc"))
<_sre.SRE_Match object; span=(0, 3), match='abc'>
>>>


match() vs search()
  --> match() - searches at the beginning of string
  --> search() - searches anywhere on the string

>>> re.match('[a-z]+', '123abc456')
>>>
>>> re.search('[a-z]+', '123abc456')
<_sre.SRE_Match object at 0x7f769f7fd4a8>
>>> re.match('[a-z]+', 'abc456')
<_sre.SRE_Match object at 0x7f769ddf5cc8>
>>>
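A minimal sketch of the difference between the matching methods, using one compiled pattern. The sample strings are made up for illustration.

```python
import re

# Compile once, reuse many times.
word = re.compile(r'[a-z]+')

# match() only succeeds if the pattern matches at position 0.
at_start = word.match('123abc456')      # None: the string starts with digits

# search() scans forward until it finds the first match.
anywhere = word.search('123abc456')     # finds 'abc'

# fullmatch() requires the whole string to match.
whole = word.fullmatch('abc456')        # None: '456' is left over

print(at_start, anywhere.group(), anywhere.span(), whole)
```

Because a failed match returns None, check the result before calling .group() on it.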


Templating
----------

`string`
You can use the `Template` class from this module to create a base string with placeholders
that are filled in later.
>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'

The `substitute()` method raises a `KeyError` exception if a placeholder has no value, but you
can bypass that with `safe_substitute()`.
>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
Traceback (most recent call last):
  ...
KeyError: 'owner'
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'

You can also subclass `Template` to change the delimiter, for example in a batch renamer:
>>> import time, os.path
>>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
>>> class BatchRename(Template):
...     delimiter = '%'
>>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format):  ')
Enter rename style (%d-date %n-seqnum %f-format):  Ashley_%n%f

>>> t = BatchRename(fmt)
>>> date = time.strftime('%d%b%y')
>>> for i, filename in enumerate(photofiles):
...     base, ext = os.path.splitext(filename)
...     newname = t.substitute(d=date, n=i, f=ext)
...     print('{0} --> {1}'.format(filename, newname))

img_1074.jpg --> Ashley_0.jpg
img_1076.jpg --> Ashley_1.jpg
img_1077.jpg --> Ashley_2.jpg

Tools for working with Lists
----------------------------

`array`
Stores homogeneous data compactly (typecode 'H' below means two-byte unsigned integers).
 
>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
26932
>>> a[1:3]
array('H', [10, 700])

* NEED MORE READING *
`deque`
Offers faster appends and pops from the left side than a list, but slower lookups in the middle.
 
>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print("Handling", d.popleft())
Handling task1

unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)
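
The fragment above relies on helpers that are not defined here (gen_moves, is_goal). A runnable sketch of the same idea, assuming a small made-up graph stored in a dict (GRAPH and the node names are hypothetical):

```python
from collections import deque

# Hypothetical graph: node -> list of neighbours.
GRAPH = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
    'D': ['F'],
    'E': ['F'],
    'F': [],
}

def breadth_first_search(start, goal):
    """Return a shortest path from start to goal as a list, or None."""
    unsearched = deque([[start]])       # queue of paths, oldest first
    seen = {start}
    while unsearched:
        path = unsearched.popleft()     # cheap pop from the left end
        node = path[-1]
        if node == goal:
            return path
        for neighbour in GRAPH[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                unsearched.append(path + [neighbour])
    return None

print(breadth_first_search('A', 'F'))
```

The `seen` set is an addition that keeps the search from revisiting nodes; the original fragment does not have it.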

* NEED MORE READING *
`bisect`
Manipulates sorted lists, for example by inserting an element at its correct position automatically.
 
>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
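
Besides inserting, bisect can look up where a value falls in a sorted list. A small sketch, assuming made-up grade boundaries (the breakpoints and letters are illustrative only):

```python
import bisect

BREAKPOINTS = [60, 70, 80, 90]   # sorted lower bounds for D, C, B, A
GRADES = 'FDCBA'

def grade(score):
    """Map a numeric score to a letter using binary search."""
    # bisect() returns the index at which score would be inserted to keep
    # the list sorted, placing it after any equal entries.
    return GRADES[bisect.bisect(BREAKPOINTS, score)]

print([grade(s) for s in [33, 99, 77, 70, 89, 90, 100]])
```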

`heapq`
Can be used by applications that repeatedly access the smallest element(s) but don't want to
run a full list sort.
 
>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data)                      # rearrange the list into heap order
>>> heappush(data, -5)                 # add a new entry
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
[-5, 0, 1]
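
A common use of a heap is a priority queue. A minimal sketch, assuming tasks are stored as (priority, name) tuples (the task names are made up):

```python
import heapq

tasks = []
heapq.heappush(tasks, (2, 'write tests'))
heapq.heappush(tasks, (1, 'fix bug'))
heapq.heappush(tasks, (3, 'update docs'))

# heappop() always returns the smallest item, so tasks come out
# in priority order no matter the order they were pushed in.
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)

# For a one-off query, nsmallest() avoids managing the heap by hand.
data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
smallest = heapq.nsmallest(3, data)
print(smallest)
```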

Collections
-----------

collections.Counter()
>>> from collections import Counter
>>> l = ['a', 'c', 'b', 'd', 'a']
>>> c = Counter(l)
>>> c
Counter({'a': 2, 'd': 1, 'c': 1, 'b': 1})
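
A short sketch of a few other Counter operations, using a made-up sentence as input:

```python
from collections import Counter

words = 'the quick brown fox jumps over the lazy dog the end'.split()
counts = Counter(words)

top = counts.most_common(1)      # most frequent word with its count
counts.update(['fox'])           # add more observations later
missing = counts['cat']          # unseen keys count as 0, no KeyError

print(top, counts['fox'], missing)
```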

Json
----

Basics
Sample json data:
{
"a": "apple",
"b": "banana",
"c": "carrot"
}
Loading json data from a file
>>> import json
>>> with open('file.json') as f:
...     data = json.load(f)
>>> data = json.load(open('file.json'))  # shorter, but the file is not closed explicitly
Loading json data from a string
>>> json.loads('{"a": "apple", "b": "banana"}')
{'a': 'apple', 'b': 'banana'}
>>>
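
Going the other way, json.dump() and json.dumps() write data out. A minimal round-trip sketch, using a temporary directory so it does not touch real files (the file name is arbitrary):

```python
import json
import os
import tempfile

data = {"a": "apple", "b": "banana", "c": "carrot"}

# To a string and back.
text = json.dumps(data, indent=2, sort_keys=True)
restored = json.loads(text)

# To a file and back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.json')
    with open(path, 'w') as f:
        json.dump(data, f)
    with open(path) as f:
        from_file = json.load(f)

print(restored == data, from_file == data)
```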

Random
------

Generate random strings
import random, string
N = 8  # length of the generated string (example value)
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
''.join(random.choices(string.ascii_uppercase + string.digits, k=N))  # Python 3.6+
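
The random module is not meant for security purposes. If the string is a password or token, the standard library's secrets module is the usual choice. A sketch of both, wrapped in helper functions whose names are made up for this example:

```python
import random
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

def random_string(n):
    """Random string for non-security use (test data, sample IDs)."""
    return ''.join(random.choices(ALPHABET, k=n))

def secure_string(n):
    """Random string drawn from the OS's secure random source."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(n))

print(random_string(8), secure_string(8))
```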

Itertools
---------

Generates permutations (useful for cracking passwords)
>>> import itertools
>>> for i in itertools.permutations('abc'):
...   print(i)
...
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
>>>
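
A short sketch of two related calls: permutations() with a length argument, and product() for candidates where characters may repeat. The input strings are arbitrary.

```python
import itertools
import math

# All orderings of 3 distinct characters: 3! = 6 results.
perms = [''.join(p) for p in itertools.permutations('abc')]

# Orderings of length 2 only.
pairs = [''.join(p) for p in itertools.permutations('abc', 2)]

# product() allows repeated characters, unlike permutations().
repeats = [''.join(p) for p in itertools.product('ab', repeat=2)]

print(len(perms) == math.factorial(3), pairs, repeats)
```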

Sunday, September 2, 2018

File I/O Modules


File Wildcards
--------------

`glob`
This module provides a function for making file lists from directory wildcard searches.
>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']

Working on Binary Data
----------------------

`struct`
Contains the `pack()` and `unpack()` functions for working with binary record formats. The
example below loops through the header information in a ZIP file without using the `zipfile` module.
import struct

with open('myfile.zip', 'rb') as f:
    data = f.read()

start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14
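    # three 4-byte and two 2-byte unsigned integers, little-endian (16 bytes)
    fields = struct.unpack('<IIIHH', data[start:start+16])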
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print(filename, hex(crc32), comp_size, uncomp_size)

    start += extra_size + comp_size     # skip to the next header

Sample output:
b'config-err-t6uao6' 0x0 0 0
b'gnome-software-34B15Y/' 0x0 0 0
b'gnome-software-GYBW5Y/' 0x0 0 0
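
To see what a struct format string does without needing a ZIP file, here is a round-trip sketch. The format '<IIIHH' (little-endian, three 4-byte and two 2-byte unsigned integers) is chosen to line up with the five header fields named above; the field values are made up.

```python
import struct

HEADER = '<IIIHH'                 # little-endian: 3 x uint32, 2 x uint16
size = struct.calcsize(HEADER)    # bytes needed for one record

# pack() turns Python integers into raw bytes...
packed = struct.pack(HEADER, 0xDEADBEEF, 120, 300, 8, 0)

# ...and unpack() turns the bytes back into a tuple of integers.
crc32, comp_size, uncomp_size, filenamesize, extra_size = struct.unpack(HEADER, packed)

print(size, hex(crc32), comp_size, uncomp_size, filenamesize, extra_size)
```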

Socket
------

Basics
AF_INET - IPv4 address family
SOCK_STREAM - TCP socket type
Methods
socket.gethostname() - returns your computer's hostname
socket.socket() - creates a socket object
simple client-server connection setup
1. create server socket
>>> import socket
>>> s = socket.socket()  # creates a socket object
>>> host = socket.gethostname()
>>> port = 8000
>>> s.bind((host, port))
# puts socket in listening state (queues 5 connections before rejecting others)
>>> s.listen(5)
# accepts an incoming connection (blocks until a client connects)
>>> conn, addr = s.accept()

2. create client socket
>>> import socket
>>> s = socket.socket()
>>> host = 'remote.system.com'
>>> port = 8000
# connects to remote system
>>> s.connect((host, port))

The server's s.accept() and the client's s.connect() are peers.
s.accept() blocks until a client connects. Once the server has called
s.listen(), a client's s.connect() can complete even before the server
calls s.accept(), because the connection waits in the listen queue. If
nothing is listening on the port, s.connect() typically fails with a
"connection refused" error instead of waiting.
sending and receiving
1. server
# continuing the example above, the server receives up to 100 bytes per
# call (buffer size). recv() returns once it has received something from
# the other end.
>>> conn.recv(100)

2. client
# To send a literal string, prefix it with `b` so it is a bytes object
>>> s.send(b'Hello world\n')
# You can also open a file in binary mode and send it.
>>> f = open('grocery list.txt', 'rb')
>>> data = f.read()
>>> f.close()
>>> s.send(data)
sending to multiple client sockets
# Continuing the examples above, let's accept 2 client sockets on the server
>>> clientsocket1, addr = s.accept()
# on client1, execute s.connect((host, port))
>>> clientsocket2, addr = s.accept()
# on client2, execute s.connect((host, port))
>>>
# Now, send separate messages to each client using their respective sockets
>>> clientsocket1.send(b'Hi client1')
# On client1, do a s.recv(4096) to receive the data
>>> clientsocket2.send(b'Hi client2')
# On client2, do a s.recv(4096) to receive the data
preparing data for transmission
You can only send bytes. Here are two ways to produce them.
>>> conn.send(b'I am no longer a string')
>>> response = 'Hello {}'.format('world')
>>> conn.send(response.encode('utf-8'))

On the receiving end, you can decode the binary data by using .decode:
>>> received_data = s.recv(1024)
>>> print(received_data.decode('utf-8'))
right way of closing connection
If the client calls `close()`, every later `recv()` call on the server returns 0 bytes (b''),
so treat an empty result as the end of the connection.
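
The pieces above can be tried in a single script. This is only a sketch under some assumptions: it runs the server in a background thread on 127.0.0.1, lets the OS pick a free port (port 0), and the server simply upper-cases whatever it receives. The function name serve_once is made up for this example.

```python
import socket
import threading

def serve_once(server):
    """Accept one client and echo its data back in upper case."""
    conn, addr = server.accept()          # blocks until a client connects
    with conn:
        while True:
            chunk = conn.recv(100)        # up to 100 bytes per call
            if not chunk:                 # b'' means the client is done sending
                break
            conn.sendall(chunk.upper())

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))             # port 0: let the OS choose a free port
server.listen(5)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', port))
client.sendall('Hello world'.encode('utf-8'))
client.shutdown(socket.SHUT_WR)           # tell the server no more data is coming

reply = b''
while True:
    chunk = client.recv(1024)
    if not chunk:                          # server closed its side
        break
    reply += chunk

client.close()
worker.join()
server.close()
print(reply.decode('utf-8'))
```

sendall() is used instead of send() because send() may transmit only part of the data and returns the number of bytes actually sent.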


Tuesday, August 28, 2018

Math/Number Modules


`math`
main module for mathematical computations
>>> import math
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0
`random`
You can use this module to generate random values.

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10)   # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()    # random float
0.17970987693706186
>>> random.randrange(6)    # random integer chosen from range(6)
4
`statistics`
Module for statistical calculation
>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095
`decimal`
Can be used by applications that require precise decimal calculations.

This calculates 5% tax on a 70 cent phone.
>>> from decimal import *
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
Decimal('0.74')
>>> round(.70 * 1.05, 2)
0.73

Performs modulo calculations and equality tests that are unsuitable for binary floating point.
>>> Decimal('1.00') % Decimal('.10')
Decimal('0.00')
>>> 1.00 % 0.10
0.09999999999999995

>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
True
>>> sum([0.1]*10) == 1.0
False

Performs very precise calculations.
>>> getcontext().prec = 36
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857')
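
For money, quantize() gives explicit control over the rounding rule. A minimal sketch, assuming prices and rates are passed as strings and that halves should round up; the function name add_tax is made up.

```python
from decimal import Decimal, ROUND_HALF_UP

def add_tax(price, rate):
    """Return price plus tax, rounded to whole cents (halves round up)."""
    total = Decimal(price) * (Decimal('1') + Decimal(rate))
    return total.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

phone = add_tax('0.70', '0.05')        # 0.70 * 1.05 = 0.7350
book = add_tax('19.99', '0.0825')      # 19.99 * 1.0825 = 21.639175
print(phone, book)
```

Build Decimal values from strings, not floats: Decimal(0.70) would carry over the float's binary rounding error.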

`functools`
>>> from functools import reduce
>>> reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])
15
>>>
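
A short sketch of reduce() with the operator module and with a starting value, which also covers the empty-list case:

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4, 5]

total = reduce(operator.add, numbers)          # ((((1+2)+3)+4)+5)
product = reduce(operator.mul, numbers, 1)     # third argument is the start value
empty = reduce(operator.add, [], 0)            # without a start value this raises TypeError

print(total, product, empty)
```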

Wednesday, August 15, 2018

Python Date/Time modules


`datetime`
Can provide time difference calculations

>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368

Using strptime()

>>> from datetime import datetime
>>>
>>> datetime.strptime('2016-01-01', '%Y-%m-%d')
datetime.datetime(2016, 1, 1, 0, 0)
>>>
>>> datetime.strptime('03-01-2018', '%Y-%m-%d')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data '03-01-2018' does not match format '%Y-%m-%d'
>>>
>>>
>>> datetime.strptime('2016-01-01', '%Y-%m-%d %H:%M:%S')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data '2016-01-01' does not match format '%Y-%m-%d %H:%M:%S'
>>>
>>> datetime.strptime('2016-01-01 04:16:34', '%Y-%m-%d %H:%M:%S')
datetime.datetime(2016, 1, 1, 4, 16, 34)        
>>>


Managing time differences

>>> from datetime import timedelta
>>> a = timedelta(days=365)
>>> b = timedelta(days=100)
>>> a - b
datetime.timedelta(265)
>>>
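
A timedelta can also be added to or subtracted from a datetime. A small sketch with an arbitrary start time:

```python
from datetime import datetime, timedelta

start = datetime.strptime('2018-03-18 04:16:34', '%Y-%m-%d %H:%M:%S')

# Adding a timedelta to a datetime gives another datetime.
end = start + timedelta(days=100, hours=3)

# Subtracting two datetimes gives a timedelta.
elapsed = end - start

print(end, elapsed.days, elapsed.seconds, elapsed.total_seconds())
```

Note that .seconds holds only the part left over after whole days; use total_seconds() for the full duration.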

Callable methods on datetime

>>> date = datetime.strptime('2018-03-18', '%Y-%m-%d')
>>>
>>> dir(date)
['__add__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__', '__rsub__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', 'astimezone', 'combine', 'ctime', 'date', 'day', 'dst', 'fold', 'fromordinal', 'fromtimestamp', 'hour', 'isocalendar', 'isoformat', 'isoweekday', 'max', 'microsecond', 'min', 'minute', 'month', 'now', 'replace', 'resolution', 'second', 'strftime', 'strptime', 'time', 'timestamp', 'timetuple', 'timetz', 'today', 'toordinal', 'tzinfo', 'tzname', 'utcfromtimestamp', 'utcnow', 'utcoffset', 'utctimetuple', 'weekday', 'year']
>>>
>>> date.timestamp()
1521302400.0
>>>

`time`
Provides simple time operations and also includes a sleep function.
>>> import time
>>> dir(time)
['CLOCK_MONOTONIC', 'CLOCK_MONOTONIC_RAW', 'CLOCK_PROCESS_CPUTIME_ID', 'CLOCK_REALTIME', 'CLOCK_THREAD_CPUTIME_ID', '_STRUCT_TM_ITEMS', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'altzone', 'asctime', 'clock', 'clock_getres', 'clock_gettime', 'clock_settime', 'ctime', 'daylight', 'get_clock_info', 'gmtime', 'localtime', 'mktime', 'monotonic', 'perf_counter', 'process_time', 'sleep', 'strftime', 'strptime', 'struct_time', 'time', 'timezone', 'tzname', 'tzset']
>>> time.localtime
<built-in function localtime>
>>> time.localtime()
time.struct_time(tm_year=2017, tm_mon=8, tm_mday=29, tm_hour=18, tm_min=20, tm_sec=35, tm_wday=1, tm_yday=241, tm_isdst=0)
>>> time.sleep(3)
>>>
`timeit`
Measures speed of small code snippets.
>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791

For larger blocks of code, use the `profile` and `pstats` modules.

Tuesday, August 7, 2018

Linux Files and Directories


Basics
------


File permissions:

.--- file type
|       .--- permission (uuugggooo)
|      |   .--- acl flag
|      |  | .--- number of links
|      |  | |   .--- user
|      |  | |   |    .--- group
|      |  | |   |    |       .--- size
|      |  | |   |    |       |      .--- date of modification
|      |  | |   |    |       |      |         .--- file name
|      |  | |   |    |       |      |         |
-rwxr-xr-x. 1 root users     6 May  7 16:59 file
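
The same fields can be read from Python with os.stat(). A sketch assuming a POSIX system, using a throwaway file in a temporary directory (the file name and mode 754 are arbitrary):

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file')
    with open(path, 'wb') as f:
        f.write(b'hello\n')
    os.chmod(path, 0o754)

    info = os.stat(path)
    listing = stat.filemode(info.st_mode)      # file type + permission string, as in ls -l
    octal = oct(stat.S_IMODE(info.st_mode))    # permission bits only
    links = info.st_nlink                      # number of links
    size = info.st_size                        # size in bytes

print(listing, octal, links, size)
```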


Locating Files
--------------

Using find:

locating and deleting files
# deletes w/o confirmation

find / -name '*garbage*' -exec rm -f {} \;

# also handles files with whitespace in their names

find -type f | xargs -d "\n" rm

# it's funny, the latter method is much faster

[~/test]_$ time find / -xdev -type f -exec ls -l {} \; &> /dev/null
real    1m3.165s
user    0m18.897s
sys     0m44.464s
[~/test]_$
[~/test]_$ time ls -l `find / -xdev -type f` &> /dev/null
real    0m1.370s
user    0m0.814s
sys     0m0.592s
[~/test]_$

File Permissions
----------------

Basic permissions:

chmod
4 - read
2 - write
1 - execute
chmod <+/-/=>

##makes a file or directory immutable
chattr +i

##makes a file or directory mutable again
chattr -i

##checks if an immutable flag is set on a file/directory
lsattr

chmod +[permission] file_name  ##grants permission to all users

chown   ##changes ownership of a file or directory
chown -c   ##shows information that changed
chown -f   ##disregards error messages
chown -R   ##executes the command recursively down the tree
chown -v   ##verbose mode
chown :   ##simultaneously changes both owner and group

chgrp   ##changes group ownership

Special Permissions:

SETUID (Set User ID)
## sets a SetUID (SUID) bit on a file or script
chmod 4
chmod u<+/->s

setUID on file:
- a program is executed with the file owner's permissions (rather than with the permissions of the user who executes it)

other notes:
- capital S means the file's owner has no execute permission
- small s means the file's owner has execute permission

SETGID (Set Group ID)
## sets a SetGID (SGID) to a file or directory
chmod 2
chmod g<+/->s

setGID on file:
- the effective group of an executing program is the file's group

setGID on directory:
- a newly created file under it will inherit its group
- a newly created directory under it will inherit its group
- if setGID is added to an existing directory, files and directories already under it keep their original group; thus, setGID's effect only applies to newly created files and directories

other notes:
- capital S means directory's group has no execute permission
- small s means directory's group has execute permission

SVTX (Sticky Bits)
## sets a sticky bit (SVTX) to a file or directory
chmod 1
chmod +t
chmod u<+/->t

- primarily used for directories
- a sticky bit on a directory prevents users from deleting or renaming files that belong to other users in that directory (whether they can modify a file still depends on that file's own permissions)
- owners of files/directories inside can modify, delete, or rename them
- the directory owner can create, delete, and rename files inside, but not modify them
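
To see how the special bits show up in a permission string, Python's stat.filemode() renders a numeric mode the way ls -l does. A sketch with made-up modes (no real files are touched):

```python
import stat

# SUID on an executable file: 's' replaces the owner's 'x'.
suid_exec = stat.filemode(stat.S_IFREG | 0o4755)

# SUID without owner execute permission: capital 'S'.
suid_noexec = stat.filemode(stat.S_IFREG | 0o4655)

# SGID on a directory: 's' in the group's execute position.
sgid_dir = stat.filemode(stat.S_IFDIR | 0o2775)

# Sticky bit on a world-writable directory (like /tmp): trailing 't'.
sticky_dir = stat.filemode(stat.S_IFDIR | 0o1777)

# The bits can also be tested directly.
has_suid = bool(0o4755 & stat.S_ISUID)
has_sticky = bool(0o0755 & stat.S_ISVTX)

print(suid_exec, suid_noexec, sgid_dir, sticky_dir, has_suid, has_sticky)
```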


## what is the trailing dot at the end of a file permission?
drwxr-xr-x. 3 surendra surendra 4096 2011-07-06 00:19 Videos
The dot indicates that the file or directory has an SELinux security context set on it


Directory Stacking
------------------

pushd       saves the current dir on the stack and goes to a target dir
popd        removes the top dir from the stack and goes back to the previous dir
dirs        lists dirs on the stack

example:

[bob@server-new ~]$ pushd dir1
~/dir1 ~
[bob@server-new dir1]$ pushd dir2
~/dir1/dir2 ~/dir1 ~
[bob@server-new dir2]$ pushd dir3
~/dir1/dir2/dir3 ~/dir1/dir2 ~/dir1 ~
[bob@server-new dir3]$ dirs
~/dir1/dir2/dir3 ~/dir1/dir2 ~/dir1 ~
[bob@server-new dir3]$ popd
~/dir1/dir2 ~/dir1 ~
[bob@server-new dir2]$ popd
~/dir1 ~
[bob@server-new dir1]$ popd
~
[bob@server-new ~]$ popd
-bash: popd: directory stack empty
[bob@server-new ~]$

Commands
--------

File movement
# copies/moves/lists only non-hidden files whose names contain a dot
# (*.* does not match dotfiles or names without a dot)
cp -p *.* /tmp
mv *.* /tmp
ls *.*
Listing
# prints directory tree (better than ls -lR)
tree

ls -l non_existing_file &> file_name  ## redirects both output and error messages to a specific file
ls -al | tee [option] file_name  ## prints the output and also saves it to a file
cat > file_name  ## lets you type on the shell; the text is written to the file after hitting CTRL + D

Tutorials
---------

Mass renaming
for f in awx*; do git mv "$f" "${f//awx_/}"; done
  # removes "awx_" on all files with filename starting at "awx"
to delete null device
cd /dev
rm "null 2>&1"
recovers deleted file
lsof | grep file_deleted.txt
ls -l /proc/PID_in_lsof/fd/any_number
cp /proc/PID_in_lsof/fd/any_number file_restored.txt
deletes several files in batch
rm -i `cat filelist.txt`
Listing Open Files/Processes

# lists files opened under a directory
lsof +D /var/log

# lists files currently opened by steve
lsof -u steve

# lists all files opened by all users except steve
lsof -u ^steve

# lists processes using a specified port
lsof -i:80
fuser -v -n tcp 80
Find all hidden files
find -iname '.*' -ls

Inodes and Blocks
-----------------

- inode is set during filesystem creation time

stat file_name  ##prints the details of a file (together with inode number)
ls -i file_name  ##list the inode of a file
cp -l soft_link file_name  ##copies the soft link file rather than the original file

mv   ##moves a file from one location to another
mv   ##renames a file
*when new_file_name has a "." at the beginning, the file becomes hidden

du -h *  ##shows the disk space used by each file (when 0, file is completely empty)

/dev/null  ##When written to, it discards all data
/dev/zero  ##When read, it returns all zeros
/dev/tty  ##When accessed, it is redirected to the actual controlling device /dev/ttyx for this program

##to redirect message to a user using tty
~rell@sysx$ tty
/dev/pts/1
~rell@sysx$ echo hi > /dev/pts/1
hi
~rell@sysx$

/dev/hda1  ##This refers to the first partition on the first IDE hard drive. Additional partitions are numbered /dev/hda2, /dev/hda3, etc. The second IDE hard drive is /dev/hdb. This partition naming scheme allows direct access to any partition on any drive without any file-manager involvement
/dev/ram  ##Makes high memory look like a hard disk (for rescue)

ULIMIT
------

ulimit - limits on the resources (such as open files and processes) a user's shell and its processes can use

##to display current ulimit values
ulimit -a

##displays hard limit (max)
ulimit -Hn

##displays soft limit (warning level)
ulimit -Sn

##displays kernel max number of files
cat /proc/sys/fs/file-max

##displays number of currently open files
cat /proc/sys/fs/file-nr

note: the ulimit commands above display the values for the current user's session

UMASK
-----

## prints the current umask value
umask

## temporarily changes the current umask (to make changes permanent, append line to ~/.bashrc or /etc/profile. changes will apply upon next login)
umask xxx

## calculating permissions for files
666 -

## calculating permissions for directories
777 -

## umask table
Octal value        Permission
0                      ---
1                      --x
2                      -w-
3                      -wx
4                      r--
5                      r-x
6                      rw-
7                      rwx

*new files never get execute permission from the umask calculation, because the base for files is 666. To add execute rights, use the chmod command.
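
The calculation can be sketched in Python. Strictly speaking the umask bits are masked out (a bitwise AND with the inverted mask) and not arithmetically subtracted; the two agree for common values such as 022. The function name apply_umask is made up for this example.

```python
def apply_umask(umask, is_dir=False):
    """Return the default permission bits for a new file or directory."""
    base = 0o777 if is_dir else 0o666
    return base & ~umask          # clear every bit that is set in the umask

file_022 = oct(apply_umask(0o022))                 # default file mode under umask 022
dir_022 = oct(apply_umask(0o022, is_dir=True))     # default directory mode under umask 022
file_027 = oct(apply_umask(0o027))
dir_027 = oct(apply_umask(0o027, is_dir=True))

# Where masking and subtraction differ: umask 033 on a file.
file_033 = oct(apply_umask(0o033))                 # 666 with the 033 bits cleared

print(file_022, dir_022, file_027, dir_027, file_033)
```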

File Conversions
----------------

iconv -f UTF-8 -t ASCII -c > --> converts UTF-8 to ASCII format

File Descriptors
----------------

##displays maximum number of system wide open file descriptors
cat /proc/sys/fs/file-max

##temporarily alters the value of open file descriptors
echo > /proc/sys/fs/file-max
sysctl -w fs.file-max=

##to apply changes permanently, add the following line in /etc/sysctl.conf
fs.file-max=
note: users need to logout and login again for the changes to take effect or issue the command: sysctl -p


/etc/security/limits.conf  ##you can set user ulimits here
syntax:
example:
merrell hard nofile 4096

Using "find" to search for files
--------------------------------

basics
##this prints full path names
find /home/merrell -name 'file*'

##this prints paths relative to the current directory
find . -name 'file*'

finding by size
##finds and list files in a simple format
find -xdev -size [-|+][suffix]
find -xdev -size [-|+][suffix] -print

##finds and list files in a long list format
find -xdev -size [-|+][suffix] -exec ls -l {} \;
find -xdev -size [-|+][suffix] | xargs ls -l
finding by access and modification time
##files that are accessed n minutes ago
find -amin [-|+]

##files that are accessed n days ago
find -atime [-|+]

##files that are modified n minutes ago
find -mmin [-|+]

##files that are greater than n days old
find -mtime +

##files that are less than n days old
find -mtime -

##files that are exactly n days old
find -mtime

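##finds files accessed more recently than a specific file was modified
find /var -anewer thisfile

##prints modification time (seconds since epoch) and path; sort the output to find the newest files
find . -type f -printf "%T@ %p\n"
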
finding by file details
##based on file system type
find -fstype

##based on group id
find -gid

##based on group name
find -group

##based on inode number
find -inum

##based on file type
find -type
d  -->directory
f  -->regular file
b  -->block device

##based on file permission
find -perm

##based on file name
find -name
find -name -print

##returns all files with the following link numbers
find -links

##finds files/directories belonging to a specific user
find -user  

notes:
- if no path is specified, find will search on the current directory
- size format may be written as +1G, +100000, etc..
examples:
find / -xdev -size +1G
find / -xdev -size -1000000 -exec ls -l {} \;
adding actions
find -name -exec {} \;
find -name | xargs
options
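-print -->prints each path on its own line (use -print0 with xargs -0 for filenames containing spaces)
-xdev  -->stays on one filesystem, so it doesn't descend into mounts such as /proc, which doesn't contain real files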
by permission
find / -type f -perm 600      # finds all files with "600" permissions
find . -type f -perm /222     # finds all files with at least 1 "write" bit set
negating options
find . ! -perm 664
find . \! -type d         # escapes ! on some shells
find /tmp -not -type f    # -not is same as !
by timestamp
find /dir -type f -printf '%T+ %p\n' | sort | head -n 1  # finds the oldest file in a directory
others
# compresses logs older than a day (by access time)
find . -name "*log*" -atime +1 -exec gzip -v {} \;

# provides non-zero exit status when no match found
find /tmp -type f -mtime +10 | egrep '.*'

# prints absolute path
find $(pwd) -type f
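
When a search needs more logic than find's options allow, the same kind of filter can be written with os.walk(). This is only a rough sketch: the function find_files and its parameters are made up, and older_than_days only approximates find's -mtime +n (find counts whole 24-hour periods).

```python
import os
import stat
import tempfile
import time

def find_files(root, min_size=0, older_than_days=None):
    """Yield paths of regular files under root that pass the filters."""
    now = time.time()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)                  # do not follow symlinks
            if not stat.S_ISREG(info.st_mode):
                continue
            if info.st_size < min_size:
                continue
            if older_than_days is not None:
                age_days = (now - info.st_mtime) / 86400
                if age_days <= older_than_days:
                    continue
            yield path

# Try it on throwaway files, including a name with a space in it.
with tempfile.TemporaryDirectory() as root:
    big = os.path.join(root, 'big file.log')
    small = os.path.join(root, 'small.log')
    with open(big, 'wb') as f:
        f.write(b'x' * 2048)
    with open(small, 'wb') as f:
        f.write(b'x')
    old = time.time() - 30 * 86400
    os.utime(small, (old, old))                    # pretend it is 30 days old

    large_names = sorted(os.path.basename(p) for p in find_files(root, min_size=1024))
    stale_names = sorted(os.path.basename(p) for p in find_files(root, older_than_days=10))

print(large_names, stale_names)
```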

Using "locate"
--------------

locate -d database_file command_or_file  ##locates a command using mlocate database
locate --database=database_file command_or_file
locate command_or_file
locate -q command_or_file  ##ignores error messages
locate -i command_or_file  ##ignores casing

slocate          

*/var/lib/mlocate/mlocate.db  ##database file for locate command

updatedb  ##updates slocate/locate database
updatedb --netpaths=' ...'
updatedb --localpaths=' ...'
updatedb -e   ##excludes a directory from the update process
updatedb --output=  ##changes the destination database of the slocate/mlocate/locate command
updatedb -U   ##updates the slocate database starting from a certain path
updatedb -f   ##excludes filesystem type from updates
updatedb -v   ##verbose mode, displays the names of all relevant files as the slocate database is updated

apropos   ##searches manual page names and descriptions for a keyword

makewhatis  ##builds the whatis database that stores information regarding linux commands

XARGS vs EXEC
-------------

-exec will work like this..
grep "rGEO" a.txt
grep "rGEO" b.txt

however xargs will work like..
grep "rGEO" a.txt b.txt

- xargs is faster most of the time
- use xargs in shell scripts to have faster response time

using xargs with filenames that contain spaces
[root@ftp01 netops]# find test_dir/A/ -type f -mtime +90 -print0 | xargs -r -0 ls -lt
-rw-r--r-- 1 root root 0 Jan  1  2013 test_dir/A/file name with spaces
[root@ftp01 netops]#

--- XARGS ---

xargs: ls: terminated by signal 13
--> output of similar commands to find /home/ -type f | xargs -r ls -lt| head -1
--> signal 13 means something is written to a pipe where nothing is read from anymore
--> safe to ignore

Links
-----

*** what is a SOFT link? ***

- a pointer to a file or directory (like windows shortcuts)
- it has a different inode
- if you delete it, original file will remain
- it can cross different types of filesystems

commands in creating a soft link
ln -s
ln -s /path/to/

how to check if a file is a soft link?
[root@secdpdevdb01 ~]# ls -li soft_link_to_apple
 125973700 lrwxrwxrwx   1 root     root           5 Jan 27 14:12 soft_link_to_apple -> apple
[root@secdpdevdb01 ~]#

** permissions are 777
** there is "l" at the beginning of the permission
** there is a pointer arrow "->" at the end

examples of relative and absolute path on soft links
lrwxrwxrwx. 1 demo tutorial   15 May 17 21:43 mylink -> ../../thismonth
lrwxrwxrwx. 1 demo tutorial   20 May 17 21:48 mylink2 -> /home/demo/thismonth

*** what is a HARD link? ***

- an additional name for the same file (cannot be used on directories)
- it has the same inode as the original file
- if you delete it, original file will remain
- it cannot cross filesystems

commands in creating a hard link
ln
ln /path/to/


# returns the path referenced by a symlink
readlink
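
The inode behaviour described above can be checked from Python. A sketch assuming a Linux or other POSIX system, run inside a temporary directory; the file names follow the apple example but are otherwise arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, 'apple')
    soft = os.path.join(tmp, 'soft_link_to_apple')
    hard = os.path.join(tmp, 'hard_link_to_apple')

    with open(original, 'w') as f:
        f.write('fruit\n')

    os.symlink('apple', soft)       # like: ln -s apple soft_link_to_apple
    os.link(original, hard)         # like: ln apple hard_link_to_apple

    # A hard link shares the inode; a soft link has its own.
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    soft_inode_differs = os.lstat(soft).st_ino != os.stat(original).st_ino
    target = os.readlink(soft)      # path stored in the symlink
    link_count = os.stat(original).st_nlink

    # Removing the original name leaves the hard link usable,
    # while the soft link now points at nothing.
    os.remove(original)
    soft_dangling = not os.path.exists(soft)
    with open(hard) as f:
        hard_content = f.read()

print(same_inode, soft_inode_differs, target, link_count, soft_dangling)
```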

Access Lists (ACLs)
-------------------

- The filesystem containing the directory or file must support ACL in order for ACL to work
  -> you can check if FS has ACL support by "cat /etc/fstab"
  -> you can see "acl" on column 4
  -> to mount a filesystem w/ ACL support: # mount -t ext3 -o acl /dev/VolGroup00/LogVol02 /work