Linux System Programming: File I/O

May 12, 2021
Notes
linux

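1. System calls are everywhere
- 1.1 How do system calls differ from library functions?
- 1.2 A simplified view of the call path
2. File I/O functions in the C standard library
3. The open and close system calls
4. PCB, file descriptor table, and file structure
5. The read and write system calls
6. System error-handling functions
7. Blocking and non-blocking
8. The lseek function
9. The fcntl function
- 9.1 F_GETFL (get file flags)
- 9.2 F_SETFL (set file flags)
10. The ioctl function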

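1. System calls are everywhere

Any operation that touches system resources or can affect other processes needs the operating system to step in, so it has to go through a system call. Conceptually, a system call is not hard to understand:
it is a programming interface (Application Programming Interface, API) that the operating system implements and exposes to applications, and it is the bridge for exchanging data between an application and the system.

1.1 How do system calls differ from library functions?

A system call is an interface that the operating system offers to the layers above it.
Library functions such as the standard I/O functions are a further wrapper around system calls.
Most applications make system calls indirectly, through the library functions of a high-level language.

1.2 A simplified view of the call path

How a standard library function call and a direct system call proceed.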

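2. File I/O functions in the C standard library

fopen, fclose, fseek, fgets, fputs, fread, fwrite, and so on.
On the command line, man fopen (and likewise for the others) shows the system's documentation for each standard library function.

2.1 fopen: opening a file

r is read-only, r+ is read/write, w is write-only and truncates to length 0, w+ is read/write and truncates to length 0, a is append (write-only), a+ is append plus read.
A b may also be added to any of these mode strings, as the man page describes.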

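The fopen function opens the file whose name is the string pointed to by path and associates a stream with it.

       The argument mode points to a string beginning with one of the following sequences (additional characters may follow these sequences):

       r      Open text file for reading. The stream is positioned at the beginning of the file.

       r+     Open for reading and writing. The stream is positioned at the beginning of the file.

       w      Truncate file to zero length or create text file for writing. The stream is positioned at the beginning of the file.

       w+     Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.

       a      Open for appending (writing at end of file). The file is created if it does not exist. The stream is positioned at the end of the file.

       a+     Open for reading and appending (writing at end of file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is always appended to the end of the file.

       The mode string can also include the letter ``b'' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with ANSI X3.159-1989 (``ANSI C'') and has no effect; the ``b'' is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the ``b'' may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-UNIX environments.)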

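2.2 Reading and writing by character: fgetc, fputc

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Character-at-a-time I/O with fgetc() and fputc()
void test01()
{
  // Write the file
  // "w+" opens for reading and writing and creates the file if it does not exist
  FILE *f_write = fopen("./test01.txt", "w+");
  if (f_write == NULL)
  {
    return;
  }
  char buf[] = "Read and write as characters";
  size_t len = strlen(buf);
  for (size_t i = 0; i < len; i++)
  {
    fputc(buf[i], f_write);
  }

  // fclose flushes the buffer before closing
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test01.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  // int, not char: fgetc returns an int so that EOF can be told apart from any byte
  int ch;
  while ((ch = fgetc(f_read)) != EOF)
  {
    printf("%c", ch);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test01();
  return 0;
}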

2.3 Reading and writing by line: fgets, fputs

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void test02()
{
  // Write the file
  // Open the file for writing
  FILE *f_write = fopen("./test02.txt", "w");
  if (f_write == NULL)
  {
    return;
  }
  const char *lines[] = {
      "hello world\n",
      "hello world1\n",
      "hello world2\n"};
  int len = sizeof(lines) / sizeof(lines[0]);
  for (int i = 0; i < len; i++)
  {
    fputs(lines[i], f_write);
  }
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test02.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  char buf[1024];
  // fgets returns NULL at end of file or on error, so test its result
  // instead of calling feof() before the read
  while (fgets(buf, sizeof(buf), f_read) != NULL)
  {
    printf("%s", buf);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test02();
  return 0;
}

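2.4 Reading and writing by block: fread, fwrite

Mainly intended for user-defined data types, which can be read and written in binary form.
Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Block I/O (user-defined types, binary) with fread() and fwrite()
void test03()
{
  // Write the file
  FILE *f_write = fopen("./test03.txt", "wb");
  if (f_write == NULL)
  {
    return;
  }

  // User-defined struct type
  struct Person
  {
    char name[16];
    int age;
  };

  struct Person persons[5] =
      {
          {"zhangsan", 25},
          {"lisi", 25},
          {"wangwu", 25},
          {"zhuliu", 25},
          {"zhuoqi", 25},
      };
  size_t len = sizeof(persons) / sizeof(struct Person);
  // Arguments: data address, size of one block, number of blocks, stream.
  // One call writes the whole array; calling it in a loop would write the array repeatedly.
  if (fwrite(persons, sizeof(struct Person), len, f_write) != len)
  {
    perror("fwrite error");
  }
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test03.txt", "rb");
  if (f_read == NULL)
  {
    return;
  }
  struct Person temp[5];
  // fread returns the number of complete blocks it actually read
  size_t n = fread(temp, sizeof(struct Person), len, f_read);
  for (size_t i = 0; i < n; i++)
  {
    printf("name: %s, age: %d \n", temp[i].name, temp[i].age);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test03();
  return 0;
}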

2.5 Formatted reading and writing: fprintf, fscanf

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void test04()
{
  // Write the file
  FILE *f_write = fopen("./test04.txt", "w");
  if (f_write == NULL)
  {
    return;
  }
  fprintf(f_write, "hello world %d year - %d - month %d - day", 2008, 8, 8);
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test04.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  char buf[1024] = {0};

  // Loop until fscanf stops matching (end of file); %1023s keeps each word inside buf
  while (fscanf(f_read, "%1023s", buf) == 1)
  {
    printf("%s ", buf);
  }
  fclose(f_read);
}


int main(int argc, char *argv[])
{
  test04();
}

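3. The open and close system calls

3.1 Looking the functions up in the man pages

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
int close(int fd);
Parameters: the file path, the access flags, and the permission bits (mode matters only together with O_CREAT; write it in octal, e.g. 0664).

3.2 The flags parameter of open

Defined in the header fcntl.h.
O_RDONLY: read-only
O_WRONLY: write-only
O_RDWR: read and write
O_APPEND: append
O_CREAT: use the file if it exists, create it if it does not
O_EXCL: used together with O_CREAT; create the file if it does not exist, fail with an error if it does
O_TRUNC: truncate the file to length 0
O_NONBLOCK: operate in non-blocking mode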

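3.3 The mode parameter of open is not the file's final permission

The octal permission passed when creating a file is further filtered by the process umask. Run umask on the command line to see the current value.

The man page for the standard library function fopen also mentions umask.

Formula: actual permission of the new file = mode & ~umask

For example, with mode = 0777 and a system umask of 002, ~umask keeps the bits 775, so the file is actually created with 0777 & 0775 = 0775.

  // Step by step
  actual file permission
  ⬇
  mode & ~umask
  ⬇
  777 & ~(002)
  ⬇
  777 & 775
  ⬇
  775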

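3.4 Common open errors

The file to open does not exist.
Opening a read-only file for writing (no matching permission on the file).
Opening a directory write-only.

3.5 Opening a file with the open system call

Compile and run it to see the output.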

#include <unistd.h> // close
#include <fcntl.h>  // open and the O_* flags
#include <stdio.h>
#include <errno.h> // errno

int main(int argc, char *argv[])
{

    // int fd = open("./dict.back", O_RDONLY | O_CREAT | O_TRUNC, 0777);

    int fd = open("./demo.txt", O_RDWR | O_TRUNC);

    printf("fd=%d \n", fd);
    
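    // Calling close(fd) here would clobber errno: if open failed, close(-1)
    // sets errno to 9 (EBADF) before it is printed below.
    // close(fd);

    if (fd != -1)
    {
        printf("open success\n");
        close(fd);
    }
    else
    {
        // On failure errno is set; each value identifies a different kind of error.
        printf("errno=%d \n", errno);
        printf("open failure\n");
    }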
    
    return 0;
}

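4. PCB, file descriptor table, and file structure

4.1 How the file descriptor table, the file structure, and the PCB structure relate

4.2 The task_struct structure

Run locate /include/linux/sched.h in a terminal to find the header; if the locate tool is missing, install it as the system suggests.
The path found is, for example, /usr/src/kernels/3.10.0-1160.11.1.el7.x86_64/include/linux/sched.h.
Open the file and you can see that task_struct holds a pointer, files, to the file descriptor table.

4.3 The file descriptor table

In sched.h, the PCB structure's member struct files_struct *files points to the file descriptor table.
From an application's point of view, it helps to think of the table as an array of pointers: the index [0/1/2/3/4…] leads to the corresponding file structure.
In essence it is a set of key-value pairs, where each of [0/1/2/3/4…] maps to the address of a specific file structure.
The mapping is automatic: given an index, the system finds the file structure it refers to.
Opening a new file returns the smallest unused descriptor in the table; the system manages this automatically.
Three descriptors are opened by default; to use them, refer to the macros the system defines.
- 0 -> macro STDIN_FILENO, standard input.
- 1 -> macro STDOUT_FILENO, standard output.
- 2 -> macro STDERR_FILENO, standard error.
In the files_struct structure, the member fd_array is the array of file pointers indexed by descriptor.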

struct files_struct
{
    // Reference count
    atomic_t count;
    ...
    // Array of file pointers, indexed by file descriptor
    struct file * fd_array[NR_OPEN_DEFAULT];
};

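4.4 The file structure

The C library's FILE structure mainly holds three things: a file descriptor, the read/write position, and an I/O buffer.
On the kernel side, every open of a file makes the kernel maintain a struct file, which it uses to operate on the file.
Locate the header that defines it with locate /include/linux/fs.h.
vim /usr/src/kernels/3.10.0-1160.11.1.el7.x86_64/include/linux/fs.h

Some commonly used members:

// Pointer to the inode, which holds the file's attributes (metadata)
struct inode            *f_inode;       /* cached value */

// Pointer to the table of functions that operate on the file's contents
const struct file_operations    *f_op;

// Reference count of this open file structure
atomic_long_t           f_count;

// Open flags such as O_RDONLY, O_NONBLOCK, O_SYNC
unsigned int            f_flags;

// Access mode the file was opened with
fmode_t                 f_mode;

// Current file offset
loff_t                  f_pos;

// Owner information, used for delivering signals
struct fown_struct      f_owner;

...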

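4.5 Maximum number of open files

By default a single process can open 1024 files. Run ulimit -a and the open files entry shows the default of 1024.

The option shown there (-n) changes the maximum number of open files for the current shell process, for example ulimit -n 4096.
This only affects the current shell and the processes it starts; after exiting the shell and starting a new one, the limit is back to the original 1024.

The value can be changed permanently by editing a system configuration file (not recommended):
vim /etc/security/limits.conf, following the format described in the file.

cat /proc/sys/fs/file-max shows the system-wide maximum number of files that can be open on this machine, which depends on the amount of memory.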

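5. The read and write system calls

5.1 Looking the functions up in the man pages

ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
read and write look alike, but their third parameter means different things: for read it is the size of the buffer (the most bytes to read), for write it is the number of bytes of actual data to write.

#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[]) {
    char buf[1024];
    ssize_t ret = 0;
    int fd = open("./dict.txt", O_RDWR);
    if (fd < 0) {
        return 1;
    }

    // read returns 0 at end of file and -1 on error, so loop only while it is positive
    while ((ret = read(fd, buf, sizeof(buf))) > 0) {
        write(STDOUT_FILENO, buf, ret);
    }

    close(fd);
    return 0;
}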

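5.2 What the buffer is for

Suppose we copy a file one byte at a time. Which is more efficient: read and write, or the corresponding standard library functions fgetc and fputc?
The call order is standard library function, then system call, then the underlying device. Every system call involves a privilege-level switch, which is relatively expensive.
So for byte-at-a-time I/O the standard library functions should in theory be faster than calling the system calls directly; the next two subsections show why.

5.2.1 The standard library (user-space) buffer used by fgetc and fputc

Path: fputc -> library buffer -> write system call -> disk
The standard library keeps its own buffer, typically 4096 bytes.
write (with its privilege-level switch from user space to kernel space) is called once per full buffer, flushing about 4096 bytes at a time.
The example below implements a file copy with fgetc and fputc.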

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    FILE *fp, *fp_out;
    int n;
    
    // Open with the standard library
    // r: read-only, r+: read and write
    fp = fopen("./from.txt", "r");
    if (fp == NULL) {
        perror("fopen error");
        exit(1);
    }
    
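    // w: write-only, truncating to 0; w+: read and write, truncating to 0
    fp_out = fopen("./to.txt", "w");
    if (fp_out == NULL) {
        perror("fopen error");
        exit(1);
    }

    // Bytes first go into the library's buffer (about 4096 bytes); only when it fills up is the system call made to write them out.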
    while ((n = fgetc(fp)) != EOF) {
        fputc(n, fp_out);
    }

    fclose(fp);
    fclose(fp_out);

    return 0;
}

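5.2.2 The system buffer used by the read and write system calls

Path: write system call -> disk
The kernel keeps a buffer of its own as well, 4096 bytes by default.
Data being written goes into that buffer first and is flushed to disk once the buffer fills up.
write (switching from user space to kernel space takes time on every call; flushing one byte at a time means a very large number of switches, which is slow).
read and write are also called Unbuffered I/O, meaning there is no user-level buffer. That does not mean the kernel buffer goes unused.
The example below implements a file copy with read and write.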

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

// Size of the buf buffer.
// #define N 1024
#define N 1

int main(int argc, char *argv[]) {
    int fd, fd_out;
    int n;
    char buf[N];

    fd = open("from.txt", O_RDONLY);
    if (fd < 0) {
        perror("open from.txt error");
        exit(1);
    }

    fd_out = open("to.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd_out < 0) {
        perror("open to.txt error");
        exit(1);
    }

    while ((n = read(fd, buf, N))) {
        if (n < 0) {
            perror("read error");
            exit(1);
        }
        write(fd_out, buf, n);
    }

    close(fd);
    close(fd_out);

    return 0;
}

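5.3 Can the standard library functions completely replace system calls?

The standard library functions reduce the number of privilege-level switches and so can be faster than direct system calls, yet they cannot fully replace them.
One example is software that has to stay real-time, such as the instant messengers QQ and WeChat, where data cannot wait in a buffer.

5.4 Read-ahead and delayed write

Switching from user space to kernel space is relatively expensive, so caching is used to make reads and writes more efficient.
Read-ahead: on file input, if the client needs 100 bytes, the kernel first fills its 4096-byte (4 KB) buffer from disk, and the next read is served from that buffer.
Delayed write: on file output, if the client needs to write 100 MB to disk, the data first fills the kernel's 4096-byte (4 KB) buffer, and the kernel then flushes it to disk batch by batch.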

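6. System error-handling functions

6.1 The exit function

Header: stdlib.h
The meaning of the argument is a convention among developers, for example 0 for a normal exit and 1 for an abnormal one; the system does not enforce it.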

...
if (fd < 0) {
  perror("open to.txt error");
  exit(1);  // 1 means failure, by agreement among developers
}

while ((n = read(fd, buf, N))) {
    if (n < 0) {
        perror("read error");
        exit(1);  // 1 means failure
    }
    write(fd_out, buf, n);
}

...

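6.2 The error number errno

Each kind of error has a number and a matching description.
Header: errno.h
Header locations: /usr/include/asm-generic/errno-base.h, /usr/include/asm-generic/errno.h

...
// For example, when the file to open does not exist, print errno to see its number: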
fd = open("test", O_RDONLY);
if (fd < 0)
{
    printf("errno = %d\n", errno);
    exit(1);
}
...

6.3 The perror function

Prints the given string to standard error, followed by the text description of the current errno.
void perror(const char *s)

...
// Open a directory for writing
// fd = open("testdir", O_RDWR);
fd = open("testdir", O_WRONLY);
if (fd < 0)
{
    perror("open testdir error");
    exit(1);
}
...

6.4 The strerror function

Returns the description for an error number.
Header: string.h
char *strerror(int errnum);

printf("open testdir error: %s\n", strerror(errno));

6.5 Error-handling example

#include <unistd.h> // read, write, close
#include <fcntl.h>  // open, O_WRONLY, O_RDONLY, O_CREAT, O_RDWR
#include <stdlib.h> // exit
#include <errno.h>
#include <stdio.h> // perror
#include <string.h>

int main(void)
{
    int fd;
#if 0
    // The file to open does not exist
    // fd = open("test", O_RDONLY | O_CREAT);
    fd = open("test", O_RDONLY);
    if (fd < 0)
    {
        printf("errno = %d\n", errno);
        printf("open test error: %s\n", strerror(errno));
        exit(1);
    }
    printf("open success");
#elif 0
    // No permission for the requested access (opening a read-only file write-only)
    fd = open("test", O_WRONLY);
    // fd = open("test", O_RDWR);
    if (fd < 0)
    {
        printf("errno = %d\n", errno);
        perror("open test error");
        exit(1);
    }
    printf("open success");

#endif
#if 1
    // Open a directory for writing
    // fd = open("testdir", O_RDWR);
    fd = open("testdir", O_WRONLY);
    if (fd < 0)
    {
        perror("open testdir error");
        exit(1);
    }
#endif

    return 0;
}

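7. Blocking and non-blocking

7.1 The concepts of blocking and non-blocking

Reading a regular file never blocks: however many bytes are requested, read returns within a bounded time. Reading from a terminal device or the network is different. If the data typed at the terminal has no newline yet, a read on the terminal blocks; if no packet has arrived from the network, a read on the network blocks, and there is no telling for how long: with no data arriving, it stays blocked. Likewise, writing to a regular file never blocks, whereas writing to a terminal device or the network may.

Now let us pin down what blocking (Block) means. When a process calls a blocking system function, it is put into the sleep (Sleep) state and the kernel schedules other processes to run, until the event the process is waiting for occurs (for example a packet arrives from the network, or the time passed to sleep elapses); only then can it possibly run again. The opposite of the sleep state is the running (Running) state. In the Linux kernel, a process in the running state is in one of two situations:

Being executed: the CPU is in the context of this process, the program counter (eip) holds the address of one of its instructions, the general-purpose registers hold intermediate results of its computation, its instructions are being executed, and its address space is being read and written.

Ready: the process is not waiting for any event and could run at any moment, but the CPU is currently executing another process, so this one waits in a ready queue to be scheduled by the kernel. Several processes may be ready at the same time, so which one gets scheduled? The kernel's scheduling algorithm is based on priorities and time slices, and it adjusts each process's priority and time slice dynamically according to how the process behaves, so that every process gets a reasonably fair chance to run while user experience is also taken into account: processes that interact with the user must not respond too slowly.

7.2 Terminal devices

File descriptors: STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO.
All three of these descriptors refer to a device file, /dev/tty.
Typing at the console feeds input into the device file; reading it through STDIN_FILENO blocks while waiting for the user to type.
Blocking versus non-blocking is a property that matters for device files.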

7.3 Blocking read from the terminal

...
   // Blocking is the default
   fd = open("/dev/tty", O_RDONLY);

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }
    else
    {
        printf("fd: %d", fd);
    }
...

7.4 Non-blocking read from the terminal (O_NONBLOCK)

#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_TRY "try again\n"

int main(void)
{
    char buf[10];
    int fd, n;

    // Blocking is the default
    // fd = open("/dev/tty", O_RDONLY);

    // The O_NONBLOCK flag makes reads from the terminal non-blocking
    fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }
    else
    {
        printf("fd: %d", fd);
    }
tryagain:

    // read returns -1 on error; with no data, errno == EAGAIN or EWOULDBLOCK
    n = read(fd, buf, 10);

    if (n < 0)
    {
        // Because O_NONBLOCK was given to open, a read on the device returns -1
        // when no data has arrived and sets errno to EAGAIN or EWOULDBLOCK
        
        if (errno != EAGAIN)
        {
            perror("read /dev/tty");
            exit(1);
        }
        sleep(3);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        goto tryagain;
    }
    write(STDOUT_FILENO, buf, n);
    close(fd);

    return 0;
}

7.5 Non-blocking read from the terminal with a timeout

#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

#define MSG_TRY "try again\n"
#define MSG_TIMEOUT "time out\n"

int main(int argc, char *argv[])
{
    char buf[10];
    int i;
    int fd;
    int n;
    
    // Use O_NONBLOCK for non-blocking mode
    fd = open("/dev/tty", O_RDONLY | O_NONBLOCK); 

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }

    printf("open /dev/tty success ... %d \n", fd);

    // timeout
    for (i = 0; i < 5; ++i)
    {
        n = read(fd, buf, 10);
        if (n > 0)
        {
            // Got some data, leave the loop
            break;
        }
        if (n < 0 && errno != EAGAIN)
        {
            // Any errno other than EAGAIN (EWOULDBLOCK) is a real error
            perror("read /dev/tty");
            exit(1);
        }
        sleep(1);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
    }
    if (i == 5)
    {
        write(STDOUT_FILENO, MSG_TIMEOUT, strlen(MSG_TIMEOUT));
    }
    else
    {
        write(STDOUT_FILENO, buf, n);
    }
    close(fd);
    return 0;
}

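7.6 Return values of read

7.6.1 Returns > 0

The number of bytes actually read.

7.6.2 Returns 0

End of file was reached.

7.6.3 Returns -1

errno != EAGAIN (and != EWOULDBLOCK): read failed.
- EAGAIN: Resource temporarily unavailable; the resource is briefly unavailable and the operation may succeed if retried a little later.
- EWOULDBLOCK: used in non-blocking mode, the operation would have blocked; on Linux it has the same value as EAGAIN.
errno == EAGAIN (or == EWOULDBLOCK): read itself is fine, there is simply no data yet.
- A device file was read in non-blocking mode and no data had arrived.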

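8. The lseek function

8.1 File offset

On Linux the lseek system call changes the file offset (the read/write position).
Every open file records its current read/write position. When a file is opened the position is 0, the start of the file, and normally reading or writing some number of bytes moves the position forward by that many bytes.
There is one exception: if the file was opened with O_APPEND, every write appends the data at the end of the file and then moves the position to the new end of the file.
lseek is similar to fseek in the standard I/O library: it moves the current read/write position (also called the offset).

8.2 The standard library function fseek

int fseek(FILE *stream, long offset, int whence)
Common values of whence for fseek: SEEK_SET, SEEK_CUR, SEEK_END
Returns 0 on success and -1 on failure.
PS: seeking past the end of the file returns 0; seeking back before the start of the file returns -1.

8.3 The lseek system call

off_t lseek(int fd, off_t offset, int whence)
Common values of whence for lseek: SEEK_SET, SEEK_CUR, SEEK_END
Returns -1 on failure; on success returns the resulting offset measured from the start of the file.
PS: lseek allows the offset to be set past the end of the file, and a later write there extends the file. Reading and writing on the same descriptor share one offset.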

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    int fd;
    int n;
    int ret;

    char msg[] = "It's a test for lseek \n";
    char ch;

    fd = open("lseek.txt", O_RDWR | O_CREAT, 0644);

    if (fd < 0)
    {
        perror("open lseek.txt error");
        exit(1);
    }

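    // Write through fd; afterwards the file offset is at the end of the written content.
    write(fd, msg, strlen(msg));

    // Reset the file offset: counting from position 0, move 12 bytes in. lseek returns the new offset.
    ret = lseek(fd, 12, SEEK_SET);

    printf("offset len: %d \n", ret);

    while ((n = read(fd, &ch, 1)))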
    {
        if (n < 0)
        {
            perror("read error");
            exit(1);
        }

        // Read the content byte by byte and write it to the screen
        write(STDOUT_FILENO, &ch, n);
    }

    close(fd);
    
    return 0;
}

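8.4 Common lseek operations

Reading and writing share a single offset, so to read back what was just written the offset has to be set again first.
PS: the offset returned by lseek is always relative to the start of the file.

8.4.1 Extending a file with lseek

Only a write actually extends the file.
lseek alone cannot do it; one real I/O operation has to follow.
Typically something like write(fd, "c", 1); supplies that I/O operation.
View the file in hexadecimal with od -tcx filename.
View the file in decimal with od -tcd filename.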

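8.4.2 The truncate function

Truncates a file to a specific length; the file is given by path.
int truncate(const char *path, off_t length)
The file must be writable.
Returns 0 on success; on failure returns -1 and sets errno.

8.4.3 The ftruncate function

Truncates a file to a specific length; the file is given by descriptor.
The file must be open and opened with write access.
int ftruncate(int fd, off_t length)
Returns 0 on success; on failure returns -1 and sets errno.

8.4.4 Getting the size of a file with lseek

off_t ret = lseek(fd, 0, SEEK_END);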

8.4.5 Combined example

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int fd;
    int ret_len;
    int ret_truncate;

    fd = open("lseek.txt", O_RDWR | O_TRUNC | O_CREAT, 0664);
    if (fd < 0)
    {
        perror("open lseek.txt error");
        exit(1);
    }

    // Seeking to the end with offset 0 returns the offset from the start, i.e. the file length
    ret_len = lseek(fd, 0, SEEK_END);

    if (ret_len == -1)
    {
        perror("lseek error");
        exit(1);
    }

    printf("len of msg = %d\n", ret_len);

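    // truncate(const char *path, off_t length): truncate a file to a given length; the file must be writable; returns 0 on success, -1 on failure

    // ftruncate(int fd, off_t length): truncate a file to a given length; the file must be open; returns 0 on success, -1 on failure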

    ret_truncate = ftruncate(fd, 1800);

    if (ret_truncate == -1)
    {
        perror("ftruncate error");
        exit(1);
    }

    printf("ftruncate file success, and ret_truncate is %d \n", ret_truncate);
#if 1

    ret_len = lseek(fd, 999, SEEK_SET);
    if (ret_len == -1)
    {
        perror("lseek seek_set error");
        exit(1);
    }

    int ret = write(fd, "a", 1);
    if (ret == -1)
    {
        perror("write error");
        exit(1);
    }

#endif

#if 0
    off_t cur = lseek(fd, -10, SEEK_SET);
    printf(" ****** %ld \n", cur);
    if (cur == -1) {
        perror("lseek error");
        exit(1);
    }
#endif
    close(fd);
    return 0;
}

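9. The fcntl function

Header: fcntl.h
File control: changes the access-control attributes of a file that is already open, without having to open it again.
int fcntl(int fd, int cmd, ... /* arg */ )
Two commands, F_GETFL and F_SETFL, are the ones to master.

9.1 F_GETFL (get file flags)

Gets the status flags of the file that the descriptor refers to.

9.2 F_SETFL (set file flags)

Sets the status flags of the file that the descriptor refers to.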

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

#define MSG_TRY "try again \n"

int main(int argc, char *argv[])
{
    char buf[10];
    int flags;
    int n;

    // Get the flags of stdin
    flags = fcntl(STDIN_FILENO, F_GETFL);
    if (flags == -1)
    {
        perror("fcntl error");
        exit(1);
    }

    // OR in the non-blocking flag (so the file need not be reopened with different flags)
    flags |= O_NONBLOCK;
    
    int ret = fcntl(STDIN_FILENO, F_SETFL, flags);
    if (ret == -1)
    {
        perror("fcntl error");
        exit(1);
    }

tryagain:
    n = read(STDIN_FILENO, buf, 10);
    if (n < 0)
    {
        if (errno != EAGAIN)
        {
            perror("read /dev/tty");
            exit(1);
        }
        sleep(3);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        goto tryagain;
    }
    write(STDOUT_FILENO, buf, n);
    return 0;
}

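10. The ioctl function

Header: sys/ioctl.h; find the file with locate sys/ioctl.h.
Mainly used with device drivers, to manage a device's I/O channel and control device-specific behavior.
It is commonly used to query a file's physical characteristics, which differ from one file type to another.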

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>

int main(void) {
    // A struct that holds the window size.
    struct winsize size;
    
    // isatty returns 0 if the descriptor is not a terminal
    if (isatty(STDOUT_FILENO) == 0) {
        exit(1);
    }
    
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &size) < 0) {
        perror("ioctl TIOCGWINSZ error");
        exit(1);
    }
    
    // Print the number of rows and columns of the console
    printf("%d rows, %d columns \n", size.ws_row, size.ws_col);
    
    return 0;
}

Tags: linux
