History of 中島_backup (No.20)

B3 First-Semester Class Schedule

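Period	Monday	Tuesday	Wednesday	Thursday	Friday
1-2					Research Meeting
3-4		Graduation Thesis 1			Research Meeting
5-6	Graduation Thesis 1	Digital Signal Processing	Graduation Thesis 1	Graduation Thesis 1
7-8					Engineering Ethics
9-10		Research Meeting
11-12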

Memo

To Do

import requests
from bs4 import BeautifulSoup
import pandas as pd
import time

# Request headers (browser-like User-Agent)
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36"
}

# List that accumulates one record per rental room
properties = []

# URL template for the listing pages ({} is the page number)
list_page_url_template = "https://suumo.jp/jj/chintai/ichiran/FR301FC005/?ar=040&bs=040&ta=16&sc=16201&cb=0.0&ct=9999999&mb=0&mt=9999999&et=9999999&cn=9999999&shkr1=03&shkr2=03&shkr3=03&shkr4=03&sngz=&po1=25&po2=99&pc=100&page={}"

# Return the stripped text of a tag, or a placeholder when the tag is missing
def text_or_default(tag, default="情報なし"):
    return tag.get_text(strip=True) if tag else default

# Fetch one property's detail page and return a list of per-room records
def get_room_details(detail_url):
    response = requests.get(detail_url, headers=headers, timeout=10)
    response.raise_for_status()  # let the caller's try/except report HTTP errors
    soup = BeautifulSoup(response.text, "html.parser")

    # Property-level information shared by every room
    # (the Japanese keys become the CSV column headers)
    property_details = {"物件URL": detail_url}

    # Access (transit) information
    access_section = soup.find("div", class_="section_access")
    property_details["アクセス"] = access_section.get_text(separator=", ", strip=True) if access_section else "情報なし"

    # Collect the information for each room
    room_list = []
    room_table = soup.find_all("div", class_="cassetteitem_other")
    for room in room_table:
        room_info = property_details.copy()  # start from the shared fields
        # A missing element yields the placeholder instead of dropping the whole property
        room_info["家賃"] = text_or_default(room.find("span", class_="cassetteitem_other-emphasis"))
        room_info["管理費"] = text_or_default(room.find("span", class_="cassetteitem_price--administration"))
        room_info["敷金"] = text_or_default(room.find("span", class_="cassetteitem_price--deposit"))
        room_info["礼金"] = text_or_default(room.find("span", class_="cassetteitem_price--gratuity"))
        room_info["間取り"] = text_or_default(room.find("span", class_="cassetteitem_madori"))
        room_info["専有面積"] = text_or_default(room.find("span", class_="cassetteitem_menseki"))
        room_list.append(room_info)

    return room_list

# Walk the listing pages and collect room information for each property
for page in range(1, 101):
    list_page_url = list_page_url_template.format(page)
    response = requests.get(list_page_url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")

    # Collect the detail-page links of the properties on this listing page
    property_links = soup.find_all("a", class_="js-cassette_link_href")

    for link in property_links:
        href = link.get("href")
        if not href:
            continue  # skip anchors without a link target
        detail_url = "https://suumo.jp" + href

        # Fetch the room information for this property
        try:
            room_data = get_room_details(detail_url)
            properties.extend(room_data)  # append every room record
            print(f"Fetched: {detail_url}")
        except Exception as e:
            print(f"Error: {detail_url}, {e}")

        # Pause between requests to avoid loading the server
        time.sleep(1)

    # Wait before moving on to the next listing page
    time.sleep(3)

# Save the data to CSV (utf-8-sig so that Excel detects the encoding)
df = pd.DataFrame(properties)
df.to_csv("富山市_賃貸部屋情報.csv", index=False, encoding="utf-8-sig")
print("Scraping finished. Saved the results to a CSV file.")
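
The script above stores 家賃, 管理費, 敷金, and 礼金 as the raw strings scraped from the page, so they cannot be sorted or averaged directly. A minimal sketch of a converter to integer yen follows. It assumes the values look like "5.5万円", "3000円", or "-"; those formats are an assumption about the site's output, not something confirmed on this page, and `parse_yen` is a hypothetical helper name.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed formats (not confirmed by the page): "5.5万円" (units of 10,000 yen),
# "3000円" (plain yen), and "-" (no amount).
_AMOUNT = re.compile(r"^([0-9]+(?:\.[0-9]+)?)(万)?円$")


def parse_yen(text: str) -> Optional[int]:
    """Convert a scraped price string to integer yen, or None if it does not match."""
    match = _AMOUNT.match(text.strip().replace(",", ""))
    if match is None:
        return None  # "-" and any unexpected format
    amount = Decimal(match.group(1))  # Decimal avoids float rounding such as 6.3 * 10000
    if match.group(2):  # "万" means the number is given in units of 10,000 yen
        amount *= 10000
    return int(amount)


if __name__ == "__main__":
    for sample in ["5.5万円", "3000円", "-"]:
        print(sample, parse_yen(sample))
```

If the assumed formats hold, the helper could be applied to a column after scraping, for example with `df["家賃"].map(parse_yen)`, before writing the CSV.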

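Research Meeting (中島)

Specialized Seminar (中島)

Handover (中島)

Memo (中島)

Interim Presentation (中島)

Interim Presentation System Summary (中島)

Main Thesis (中島)

B3 First-Semester Class Schedule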