IP(4)               Linux Programmer's Manual               IP(4)

       ip - Linux IPv4 protocol implementation

       #include <sys/socket.h>
       #include <netinet/in.h>

       tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
       raw_socket = socket(PF_INET, SOCK_RAW, protocol);
       udp_socket = socket(PF_INET, SOCK_DGRAM, protocol);

       Linux implements the IPv4 protocol described in RFC791 and
       RFC1122.  ip contains a level 2 multicasting implementation
       conforming to RFC1112.  It also contains an IP router
       including a packet filter.

       The protocol is implemented in the kernel on the basis  of
       a  BSD  compatible socket interface.  For more information
       on sockets, see socket(4).

       An IP socket is created by calling the socket(2)  function
       with  a PF_INET socket family argument. Valid socket types
       are SOCK_STREAM to open a  tcp(4)  socket,  SOCK_DGRAM  to
       open  a  udp(4)  socket, or SOCK_RAW to open a raw socket.
       protocol is the IP protocol in the IP header to be received
       or sent.  For TCP sockets only 0 or IPPROTO_TCP is valid,
       and for UDP sockets only 0 or IPPROTO_UDP.  For SOCK_RAW
       you may specify any valid IANA IP protocol number as defined
       in the RFC1700 assigned numbers.

       Raw sockets may only be opened by a process with effective
       user id 0 or when the process has the CAP_NET_RAW
       capability.

       When a process wants to receive new incoming packets or
       connections, it should bind the socket to a local interface
       address using bind(2).  When INADDR_ANY is specified, the
       socket is bound to all local interfaces.  A bound TCP socket
       is unavailable for some time after closing, unless the
       SO_REUSEADDR flag is set.
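
       The binding steps described above can be sketched as follows.
       This is a minimal illustration, not a reference implementation:
       the helper name open_listener and the listen(2) backlog of 16
       are assumptions made for the example.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP socket bound to all local interfaces on the given port
 * (host byte order; 0 lets the kernel pick one).
 * Returns the descriptor, or -1 on error. */
int open_listener(unsigned short port)
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow the address to be reused shortly after a previous close. */
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

       Passing port 0 and then calling getsockname(2) is a convenient
       way to find out which port the kernel selected.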

       An  IP socket address is defined as a combination of an IP
       interface address and a port number.

              struct sockaddr_in {
                  sa_family_t    sin_family; /* address family: AF_INET */
                  u_int16_t      sin_port;   /* port in network byte order */
                  struct in_addr sin_addr;   /* internet address */
              };

              /* Internet address. */
              struct in_addr {
                  u_int32_t      s_addr;     /* IPv4 address in network byte order */
              };

       sin_family is always set to AF_INET.  This is required; in
       Linux 2.2 most networking functions return EINVAL when
       this setting is missing.  sin_port contains the port in
       network byte order.  The port numbers below 1024 are called
       reserved ports.  Only processes with effective user id 0
       or the CAP_NET_BIND_SERVICE capability may bind(2) to
       these ports.  Note that the raw IPv4 protocol as such has
       no concept of a port; ports are only implemented by higher
       protocols like tcp(4) and udp(4).

       sin_addr is the host address.  The s_addr member of struct
       in_addr contains the host interface address in network
       order.  in_addr should only be accessed using the
       inet_aton(3), inet_addr(3), inet_makeaddr(3) library
       functions or directly with the name resolver (see
       gethostbyname(3)).  IPv4 addresses are divided into unicast,
       broadcast and multicast addresses.  Unicast addresses specify
       a single interface of a host, broadcast addresses specify
       all hosts on a network and multicast addresses address all
       hosts in a multicast group.  Datagrams to broadcast
       addresses are only passed to the user when the socket
       broadcast flag is set.  The flag must also be set to send
       datagrams to broadcast addresses.  Connection oriented
       sockets are only allowed to use unicast addresses.

       Note that the address and the port are always stored in
       network order.  In particular, this means that you need to
       call htons(3) on the number that is assigned to a port.
       All address/port manipulation functions in the standard
       library automatically convert to network order.
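
       As an illustration of the byte order rules above, the sketch
       below fills a sockaddr_in from a dotted-quad string and a host
       order port.  The helper name make_sockaddr is hypothetical;
       inet_aton(3) and htons(3) are the standard library calls.

```c
#define _DEFAULT_SOURCE   /* for inet_aton() under strict -std= modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *out from dotted-quad text and a port in host byte order.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int make_sockaddr(const char *dotted, unsigned short port,
                  struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);          /* explicit conversion needed */
    if (inet_aton(dotted, &out->sin_addr) == 0)
        return -1;                        /* inet_aton stores network order */
    return 0;
}
```

       Note that only the port needs an explicit conversion here;
       inet_aton(3) already stores the address in network order.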

       IP supports some protocol specific socket options that can
       be set with setsockopt(2) and read by getsockopt(2).  The
       socket option level for IP is SOL_IP.

              Sets or gets the IP options to be sent with every
              packet from this socket.  The arguments are a
              pointer to a memory buffer containing the options
              and the option length.  setsockopt(2) sets the IP
              options associated with a socket.  The maximum option
              size for IPv4 is 40 bytes.  See RFC791 for the
              allowed options.  When the initial connection
              request packet for a SOCK_STREAM socket contains IP
              options, the outgoing IP options are automatically
              set to the received options with routing headers
              reversed, so outgoing packets echo the received
              options.  After the connection is established,
              incoming packets are no longer allowed to change
              options.  The processing of all incoming source
              routing options can be disabled using the
              accept_source_route sysctl, which is off by default.
              For datagram sockets IP options can only be set by
              the local user.  getsockopt(2) returns the current
              send IP options.

              Pass an IP_PKTINFO ancillary message that contains an
              in_pktinfo structure supplying some information
              about the incoming packet.  This only works for
              datagram oriented sockets.

              struct in_pktinfo {
                  unsigned int   ipi_ifindex;  /* Interface index */
                  struct in_addr ipi_spec_dst; /* Routing destination address */
                  struct in_addr ipi_addr;     /* Header Destination address */
              };

              ipi_ifindex is  the  index  of  the  interface  the
              packet  was  received on.  The ipi_spec_dst address
              is the RFC specified destination  address  and  may
              differ  from  ipi_addr  when  the  packet  contains
              source routing options.

              If IP_PKTINFO is passed to sendmsg(2) then the
              outgoing packet will be sent over the interface
              specified in ipi_ifindex with the destination address
              set to ipi_spec_dst.
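
              Receiving the IP_PKTINFO message might look like the
              sketch below.  The helper names enable_pktinfo and
              recv_with_pktinfo are hypothetical; the example assumes
              a datagram socket and the glibc definition of struct
              in_pktinfo from <netinet/in.h>.

```c
#define _GNU_SOURCE   /* exposes struct in_pktinfo and SOL_IP in glibc */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Ask the kernel to attach an IP_PKTINFO message to each datagram. */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Receive one datagram into buf and copy its in_pktinfo to *info.
 * Returns the payload length, or -1 on error or when no IP_PKTINFO
 * message was attached. */
ssize_t recv_with_pktinfo(int fd, void *buf, size_t len,
                          struct in_pktinfo *info)
{
    /* The union keeps the control buffer aligned for struct cmsghdr. */
    union {
        char raw[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.raw;
    msg.msg_controllen = sizeof(control.raw);

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Walk the ancillary data looking for the IP_PKTINFO message. */
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_IP && c->cmsg_type == IP_PKTINFO) {
            memcpy(info, CMSG_DATA(c), sizeof(*info));
            return n;
        }
    }
    return -1;
}
```

              A server bound to INADDR_ANY can use ipi_addr to learn
              which local address a datagram was actually sent to.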

              If enabled, the IP_TOS ancillary message is passed
              with incoming packets.  It contains a byte holding
              the Type of Service/Precedence field of the packet
              header.  Expects a boolean integer flag.

              Set or read a flag to pass an IP_RECVTTL ancillary
              message that contains the time to live field of the
              received packet as a byte.  Not supported for
              SOCK_STREAM sockets.

              Pass all incoming IP options to the user in an
              IP_OPTIONS control message.  The routing header and
              other options are already filled in for the local
              host.  Not supported for SOCK_STREAM sockets.

              Identical  to  IP_RECVOPTS  but  returns raw unpro-
              cessed options  with  timestamp  and  route  record
              options not filled in for this hop.

       IP_TOS Set or receive the Type-Of-Service (TOS) field that
              is sent with every IP packet originating from this
              socket.  It is used to prioritize packets on the
              network.  TOS is a byte.  Some standard TOS flags
              are defined: IPTOS_LOWDELAY to minimize delays for
              interactive traffic, IPTOS_THROUGHPUT to optimize
              throughput, IPTOS_RELIABILITY to optimize for
              reliability, and IPTOS_MINCOST for "filler data"
              where slow transmission doesn't matter.  At most
              one of these TOS values can be specified.  Other
              bits are invalid and shall be cleared.  By default
              Linux sends IPTOS_LOWDELAY datagrams first, but the
              exact behaviour depends on the configured queueing
              discipline.  Some high priority levels may require
              effective user id 0 or the CAP_NET_ADMIN capability.
              The priority can also be set in a protocol
              independent way by the (SOL_SOCKET, SO_PRIORITY)
              socket option (see socket(4)).

       IP_TTL Set  or  receive  the  time to live field for every
              outgoing IP packet.
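
              Setting and reading back IP_TOS and IP_TTL could be
              sketched like this.  The helper names are hypothetical,
              and the IPTOS_* constants are assumed to come from
              <netinet/ip.h> as they do in glibc.

```c
#define _GNU_SOURCE   /* exposes SOL_IP in glibc */
#include <netinet/in.h>
#include <netinet/ip.h>   /* IPTOS_LOWDELAY and friends */
#include <sys/socket.h>

/* Set the TTL and TOS used for packets sent from fd.
 * Returns 0 on success, -1 on error. */
int set_ttl_and_tos(int fd, int ttl, int tos)
{
    if (setsockopt(fd, SOL_IP, IP_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;
    return setsockopt(fd, SOL_IP, IP_TOS, &tos, sizeof(tos));
}

/* Read an integer-valued SOL_IP option such as IP_TTL or IP_TOS.
 * Returns the value, or -1 on error. */
int get_ip_option(int fd, int name)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_IP, name, &value, &len) < 0)
        return -1;
    return value;
}
```

              For example, an interactive application might call
              set_ttl_and_tos(fd, 32, IPTOS_LOWDELAY) once after
              creating its socket.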

              If enabled, the user supplies its own IP header in
              front of the user data.  Only valid for SOCK_RAW
              sockets.  See raw(4) for more information.  When this
              flag is enabled the values set by IP_OPTIONS,
              IP_TTL and IP_TOS are ignored.

       IP_RECVERR
              Enable extended reliable error message passing.
              When enabled on a datagram socket, all generated
              errors are queued in a per-socket error queue.
              When the user receives an error from a socket
              operation, the errors can be retrieved by calling
              recvmsg(2) with the MSG_ERRQUEUE flag set.  The
              sock_extended_err structure describing the error is
              passed in an ancillary message with the type
              IP_RECVERR and the level SOL_IP.  This is useful for
              reliable error handling on unconnected sockets.  The
              received data portion of the error queue contains
              the error packet.

              IP uses the sock_extended_err structure as follows:
              ee_origin   set  to  SO_EE_ORIGIN_ICMP  for  errors
              received as an ICMP packet,  or  SO_EE_ORIGIN_LOCAL
              for  locally generated errors.  ee_type and ee_code
              are set from the type and code fields of  the  ICMP
              header.   ee_info  contains  the discovered MTU for
              EMSGSIZE errors.  ee_data is  currently  not  used.
              When  the error originated from the network, all IP
              options (IP_OPTIONS, IP_TTL, etc.) enabled  on  the
              socket and contained in the error packet are passed
              as control messages.  The  payload  of  the  packet
              causing the error is returned as normal data.

              On SOCK_STREAM TCP sockets, IP_RECVERR has slightly
              different semantics.  Instead of queueing the
              errors reliably, it passes all incoming errors
              immediately to the user.  This might be useful for
              very short-lived TCP connections that need quick
              error handling.  Use this option with care: it makes
              TCP unreliable by not allowing it to recover
              properly from routing shifts and other normal
              conditions.  Note that TCP has no error queue;
              MSG_ERRQUEUE is not valid on SOCK_STREAM sockets.
              All errors are passed by return value only.

              For  raw sockets, IP_RECVERR enables passing of all
              received ICMP errors to the  application.  This  is
              turned off by default for compatibility.

              It  sets  or  receives  an  integer  boolean  flag.
              IP_RECVERR defaults to off.

              Sets or receives the Path MTU Discovery setting for
              a socket.  When enabled, Linux will perform Path MTU
              Discovery as defined in RFC1191 on this socket.  The
              system-wide default is controlled by the
              ip_no_pmtu_disc sysctl for SOCK_STREAM sockets, and
              discovery is disabled on all others.  The user can
              retrieve the path MTU using the IP_MTU or the
              IP_RECVERR options.

              |Path MTU discovery flags | Meaning                        |
              |IP_PMTUDISC_WANT         | Use per-route settings.        |
              |IP_PMTUDISC_DONT         | Never do Path MTU Discovery.   |
              |IP_PMTUDISC_DO           | Always do Path MTU Discovery.  |

              When PMTU discovery is enabled the kernel
              automatically keeps track of the path MTU.  For TCP
              sockets the outgoing packets are automatically sized
              based on the path MTU; for datagram oriented sockets
              the user has to size the datagrams appropriately.
              When it is enabled the kernel rejects packets bigger
              than the path MTU with EMSGSIZE.  See raw(4) and
              udp(4) for more information.
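
              Selecting one of the flags from the table could look
              like the following sketch.  It assumes the option is
              exposed as IP_MTU_DISCOVER, the name used by
              <netinet/in.h>; the helper names are hypothetical.

```c
#define _GNU_SOURCE   /* exposes SOL_IP and IP_PMTUDISC_* in glibc */
#include <netinet/in.h>
#include <sys/socket.h>

/* Select a Path MTU Discovery mode (IP_PMTUDISC_WANT, IP_PMTUDISC_DONT
 * or IP_PMTUDISC_DO) for fd.  Returns 0 on success, -1 on error. */
int set_pmtu_mode(int fd, int mode)
{
    return setsockopt(fd, SOL_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}

/* Return the current Path MTU Discovery mode of fd, or -1 on error. */
int get_pmtu_mode(int fd)
{
    int mode = 0;
    socklen_t len = sizeof(mode);

    if (getsockopt(fd, SOL_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
        return -1;
    return mode;
}
```

              With IP_PMTUDISC_DO set on a datagram socket, the
              application should be prepared to handle EMSGSIZE and
              resend smaller datagrams.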

       IP_MTU Retrieve the currently known path MTU of the current
              socket.  Only valid when the socket has been
              connected.  Returns an integer.  Only valid as a
              getsockopt(2).

              Pass all forwarded packets with the IP Router Alert
              option set to this socket. Only valid for raw sock-
              ets.  This  is useful, for instance, for user space
              RSVP daemons. Expects an integer argument.

              Sets or reads the time-to-live value of outgoing
              multicast packets for this socket.  It is very
              important for multicast packets to set the smallest
              TTL possible.  The default is 1, which means that
              multicast packets don't leave the local network
              unless the user program explicitly requests it.
              Argument is an integer.

              Sets or reads a boolean  integer  argument  whether
              sent multicast packets should be looped back to the
              local sockets.
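
              The two multicast send settings described above could
              be adjusted as in this sketch.  It assumes they are
              exposed as IP_MULTICAST_TTL and IP_MULTICAST_LOOP, the
              names used by <netinet/in.h>; the helper names are
              hypothetical.

```c
#define _GNU_SOURCE   /* exposes SOL_IP in glibc */
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the TTL for outgoing multicast packets and choose whether they
 * are looped back to local sockets (loop is 0 or 1).
 * Returns 0 on success, -1 on error. */
int set_multicast_send_options(int fd, int ttl, int loop)
{
    if (setsockopt(fd, SOL_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;
    return setsockopt(fd, SOL_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
}

/* Read an integer-valued multicast option; returns -1 on error. */
int get_multicast_option(int fd, int name)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_IP, name, &value, &len) < 0)
        return -1;
    return value;
}
```

              Keeping the TTL at its default of 1 unless a wider scope
              is really needed follows the advice given above.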

              Join a multicast group.  Argument is an ip_mreqn
              structure.

              struct ip_mreqn {
                  struct in_addr imr_multiaddr; /* IP multicast group address */
                  struct in_addr imr_address;   /* IP address of local interface */
                  int            imr_ifindex;   /* interface index */
              };

              imr_multiaddr contains the address of the multicast
              group the application wants to join or leave.  It
              must be a valid multicast address.  imr_address is
              the address of the local interface with which the
              system should join the multicast group; if it is
              equal to INADDR_ANY an appropriate interface is
              chosen by the system.  imr_ifindex is the interface
              index of the interface that should join/leave the
              imr_multiaddr group, or 0 to indicate any
              interface.

              For compatibility, the  old  ip_mreq  structure  is
              still  supported.  It differs from ip_mreqn only by
              not including the imr_ifindex field. Only valid  as
              a setsockopt(2).

              Leave a multicast group. Argument is an ip_mreqn or
              ip_mreq structure similar to IP_ADD_MEMBERSHIP.
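
              Joining and leaving a group with ip_mreqn might be
              sketched as follows.  The helper names are hypothetical,
              and the leave operation is assumed to be exposed as
              IP_DROP_MEMBERSHIP, the name used by <netinet/in.h>.

```c
#define _GNU_SOURCE   /* exposes struct ip_mreqn and SOL_IP in glibc */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Build an ip_mreqn for the dotted-quad group and apply optname to fd.
 * INADDR_ANY plus ifindex 0 lets the system choose the interface. */
static int membership_op(int fd, int optname, const char *group,
                         int ifindex)
{
    struct ip_mreqn req;

    memset(&req, 0, sizeof(req));
    if (inet_aton(group, &req.imr_multiaddr) == 0)
        return -1;                       /* not an IPv4 address at all */
    req.imr_address.s_addr = htonl(INADDR_ANY);
    req.imr_ifindex = ifindex;
    return setsockopt(fd, SOL_IP, optname, &req, sizeof(req));
}

/* Join the multicast group; returns 0 on success, -1 on error. */
int join_group(int fd, const char *group, int ifindex)
{
    return membership_op(fd, IP_ADD_MEMBERSHIP, group, ifindex);
}

/* Leave a previously joined group; returns 0 on success, -1 on error. */
int leave_group(int fd, const char *group, int ifindex)
{
    return membership_op(fd, IP_DROP_MEMBERSHIP, group, ifindex);
}
```

              Passing an address that is not a multicast address makes
              the join fail, matching the requirement stated above.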

              Set the local device for a multicast socket.  Argu-
              ment is an ip_mreqn or ip_mreq structure similar to

              When an invalid socket option is passed,
              ENOPROTOOPT is returned.

       The IP protocol supports the sysctl interface to configure
       some global options. The sysctls can be accessed by  read-
       ing or writing the /proc/sys/net/ipv4/* files or using the
       sysctl(2) interface.
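
       Reading one of these settings through the /proc files could be
       sketched as below.  The helper name read_ipv4_sysctl is
       hypothetical, and the example assumes the sysctl holds a single
       integer (or that only its first integer is wanted).

```c
#include <stdio.h>

/* Read the first integer of /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    int n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                       /* name too long for the buffer */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

       For instance, read_ipv4_sysctl("ipfrag_high_thresh", &v) would
       fetch the fragment queue limit described below.  Writing a
       sysctl works the same way with the file opened for writing,
       but normally requires privileges.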

              Set the  default  time-to-live  value  of  outgoing
              packets.  This  can  be changed per socket with the
              IP_TTL option.

              Enable IP forwarding with a boolean flag.  IP
              forwarding can also be set on a per interface basis.

              Enable dynamic socket address rewriting on
              interface address change.  This is useful for dial-up
              interfaces with changing IP addresses.

              Not documented.

              Contains two integers that define the default local
              port range allocated to sockets.  Allocation starts
              with the first number and ends with the second
              number.

              If enabled, don't do Path MTU Discovery for TCP
              sockets by default.  Path MTU discovery may fail if
              misconfigured firewalls (that drop all ICMP packets)
              or misconfigured interfaces (e.g., a point-to-point
              link where both ends don't agree on the MTU) are on
              the path.  It is better to fix the broken routers on
              the path than to turn off Path MTU Discovery
              globally, because not doing it incurs a high cost to
              the network.

       ipfrag_high_thresh and ipfrag_low_thresh
              If  the  amount  of  queued  IP  fragments  reaches
              ipfrag_high_thresh,  the  queue  is  pruned down to
              ipfrag_low_thresh.  Contains an  integer  with  the
              number of bytes.

       These  ioctls can be accessed using ioctl(2).  The correct
       syntax is:

              error = ioctl(ip_socket, ioctl_type, value_ptr);

              Return a struct timeval with the receive  timestamp
              of the last packet passed to the user. This is use-
              ful for accurate round trip time measurements.  See
              setitimer(2) for a description of struct timeval.

              Set the process or process group that receives
              SIGIO or SIGURG signals when an asynchronous I/O
              operation has finished or urgent data is available.
              Argument is a pid_t; a negative value selects the
              process group whose id is the absolute value.  Only
              processes with effective user id 0 may set this
              value to an arbitrary process/group id; all others
              may only set it to processes/groups with a matching
              effective group id or user id.

              Set  a  flag to enable or disable asynchronous mode
              of the socket. Asynchronous mode means  that  SIGIO
              is raised when a new I/O event occurs.

              See  socket(4)  for  a  description of the valid IO

              Get the current process or process group that
              receives SIGIO or SIGURG signals, or 0 when none is
              set.  Argument is a pid_t.

       The ioctls to  configure  firewalling  are  documented  in
       ipfw(4) from the ipchains package.

       Ioctls   to   configure   generic  device  parameters  are
       described in netdevice(4).

       Be very careful with the SO_BROADCAST option - it  is  not
       privileged  in  Linux.  It is easy to overload the network
       with careless broadcasts. For new application protocols it
       is  better  to use a multicast group instead of broadcast-
       ing. Broadcasting is discouraged.

       Some other BSD sockets  implementations  provide  IP_RCVD-
       STADDR and IP_RECVIF socket options to get the destination
       address and the interface of received datagrams. Linux has
       the more general IP_PKTINFO for the same task.

               The  operation  is  only  defined  on  a connected
               socket, but the socket wasn't connected.

       EINVAL  Invalid argument passed.

               Datagram is bigger than an MTU on the path and  it
               cannot be fragmented.

       EACCES  The user tried to execute an operation without the
               necessary permissions. These include sending to  a
               broadcast  address  without  having  the broadcast
               flag set, trying to modify the  firewall  settings
               without  effective  user id 0 or CAP_NET_ADMIN, or
               trying to bind to a reserved port  without  effec-
               tive user id 0 or CAP_NET_BIND_SERVICE.

               Tried to bind to an address already in use.

               Not enough memory available.

               Invalid socket option passed.

       EPERM   User doesn't have permission to set high priority,
               change configuration, or send signals to the
               requested process or group.

               A  non-existent  interface  was  requested  or the
               requested source address was not local.

       EAGAIN  Operation on a non-blocking socket would block.

               The socket is not configured or an unknown  socket
               type was requested.

       EISCONN connect(2) was called on an already connected
               socket.

               A connection operation on a non-blocking socket
               is already in progress.

               A connection was closed during an accept(2).

       EPIPE   The  connection  was  unexpectedly  closed or shut
               down by the other end.

       ENOENT  SIOCGSTAMP was called on a socket where no  packet

               No  routing  table  entry  matches the destination

       ENODEV  Network device not available  or  not  capable  of
               sending IP.

       ENOPKG  A kernel subsystem was not configured.

       Other errors may be generated by the underlying protocols;
       see tcp(4), raw(4), udp(4) or the generic socket layer.

       IP_RECVERR,  and  IP_ROUTER_ALERT are new options in Linux

       struct ip_mreqn is new in Linux 2.2.  Linux 2.0 only  sup-
       ported ip_mreq.

       The sysctls were introduced with Linux 2.2.

       For   compatibility   with   Linux   2.0,   the   obsolete
       socket(PF_INET, SOCK_RAW, protocol) syntax is  still  sup-
       ported  to open a packet(4) socket. This is deprecated and
       should be replaced by socket(PF_PACKET,  SOCK_RAW,  proto-
       col)  instead.  The main difference is the new sockaddr_ll
       address  structure  for  generic  link  layer  information
       instead of the old sockaddr_pkt.

       There are too many inconsistent error values.

       The  ioctls to configure IP-specific interface options and
       ARP tables are not described.

       This man page was written by Andi Kleen.

       sendmsg(2),  recvmsg(2),  socket(4),  netlink(4),  tcp(4),
       udp(4), raw(4), ipfw(4)

       RFC791, RFC1122, RFC1812

Linux Man Page             24 Dec 1998                          1